Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-15 Thread serguei.spit...@oracle.com

Hi Richard,

This is on my review list. I'll try to get it reviewed by the end of 
this week.


Thanks,
Serguei


On 9/8/20 10:02, Reingruber, Richard wrote:

Hello Marty,

Sure. I'd be happy if Serguei could review the change.

Thanks, Richard.

-Original Message-
From: Marty Thompson 
Sent: Dienstag, 8. September 2020 18:55
To: Reingruber, Richard ; Daniel Daugherty 
; serviceability-dev ; 
hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime 
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hello Richard,

It would be good if Serguei Spitsyn could review before this is pushed.  
Serguei is out this week.  Can you wait until Serguei is back in the office the 
week of Sept 14?

Regards,

Marty


-Original Message-
From: Reingruber, Richard 
Sent: Tuesday, September 8, 2020 9:45 AM
To: Daniel Daugherty ; serviceability-dev
; hotspot-compiler-
d...@openjdk.java.net; Hotspot dev runtime 
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
in the Presence of JVMTI Agents

Hi Dan,

I'd be very happy about a review from somebody on the Serviceability team.
I have asked for reviews many times (kindly I hope). And the change is for
review for more than a year now.

According to [1] I'd think all requirements to push are met already. But
maybe I missed something?

After renaming of methods in SafepointMechanism the change needs to be
rebased (already done). I'll publish a pull request as soon as possible.

Thanks, Richard.

[1]
https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change

-Original Message-
From: Daniel D. Daugherty 
Sent: Dienstag, 8. September 2020 18:16
To: Reingruber, Richard ; serviceability-dev
; hotspot-compiler-
d...@openjdk.java.net; Hotspot dev runtime 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
in the Presence of JVMTI Agents

Hi Richard,

I haven't seen a review from anyone on the Serviceability team and I think
you should get a review from them since JVM/TI is involved.
Perhaps I missed it...

Dan


On 9/7/20 10:09 AM, Reingruber, Richard wrote:

Hi,

I would like to close the review of this change.

It has received a lot of helpful feedback during the process and 2
full Reviews. Thanks everybody!

I'm planning to push it this week on Thursday as solution for JBS items:

https://bugs.openjdk.java.net/browse/JDK-8227745
https://bugs.openjdk.java.net/browse/JDK-8233915

Version to be pushed:

http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/

Hope to get my GIT/Skara setup going until then... :)

Thanks, Richard.

-Original Message-
From: hotspot-compiler-dev
 On Behalf Of Reingruber,
Richard
Sent: Mittwoch, 2. September 2020 23:27
To: Robbin Ehn ; serviceability-dev
;
hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime

Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for
Better Performance in the Presence of JVMTI Agents

Hi Robin,


On 2020-09-02 15:48, Reingruber, Richard wrote:

Hi Robbin,

// taking the discussion back to the mailing lists

 > I still don't understand why you don't deoptimize the objects inside

the

 > handshake/safepoint instead?

So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

Just to make it clear: I'm not fond of the extra suspension mechanism
currently used for JDK-8227745 either. I want to get rid of it and I
will work on it. Asynch handshakes (JDK-8238761) could be a
replacement for it. At least I think they can be used to suspend the target

thread.

For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

Yes, this should be possible. This would be an advanced change though.
I would like EscapeBarriers to be a no-op and fall back to current
implementation, if C2-EscapeAnalysis/Graal are disabled.


So I suggest future work to instead just execute the safepoint with
the requesting JT instead of having a this special safepoiting mechanism.
Since you are missing above functionality I see why you went this way.
If you need to push it, it's fine by me.

We will work on further improvements. Top of the list would be
eliminating the extra suspend mechanism.

The implementation has matured for more than 12 months now [1]. It's
been tested extensively at SAP over that time and passed also extended
testing at Oracle kindly conducted by Vladimir Kozlov. We've got two
full Reviews and incorporated extensive feedback from a number of
OpenJDK Reviewers (including you, thanks!). Based on that I reckon
we're good to push the change as enhancement
(JDK-8227745) and bug fix (JDK-8233915).


Thanks for explaining once again :)

Pleasure :)

Thanks, Richard.

[1]

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-09 Thread Reingruber, Richard
> Hi Richard,

> I suspect this one fell off the radar due to the extended review period. 
> The actual review started last December (there was prior discussion 
> IIRC) and only seemed to get partial reviews. I only looked at some 
> parts. Robbin may have given things a deeper look, but seemed focused on 
> the handshake aspects. Vladimir said he would do a full review but I 
> can't find it. Eventually Martin and Goetz took over reviewing and 
> everyone else dropped off. :(

That's how it went I reckon. I repeatedly asked for feedback and reviews, and
also tried to keep Vladimir, Robbin, and you in the loop addressing you directly
(e.g. [1])

> As this covers a number of areas it really does need "approval" from 
> each area (and yes the hotspot wiki should reflect this).

I agree. The wiki should define that in a clear manner. And the community should
be involved in that definition.

> I will try to take another look while we await Serguei's return (and I 
> never did follow up on the problem I had with the nested lock 
> elimination handling. :( ).

Thanks for doing it.

> Meanwhile this will need to be converted to a PR in any case.

I hope to get the PR out later but we've got a team outing today... we haven't 
seen
each other since months... :)

Cheers, Richard.

[1] 
http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/030911.html

-Original Message-
From: David Holmes  
Sent: Mittwoch, 9. September 2020 00:29
To: Reingruber, Richard ; Marty Thompson 
; Daniel Daugherty ; 
serviceability-dev ; 
hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime 
; Robbin Ehn ; 
Vladimir Kozlov 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

I suspect this one fell off the radar due to the extended review period. 
The actual review started last December (there was prior discussion 
IIRC) and only seemed to get partial reviews. I only looked at some 
parts. Robbin may have given things a deeper look, but seemed focused on 
the handshake aspects. Vladimir said he would do a full review but I 
can't find it. Eventually Martin and Goetz took over reviewing and 
everyone else dropped off. :(

As this covers a number of areas it really does need "approval" from 
each area (and yes the hotspot wiki should reflect this).

I will try to take another look while we await Serguei's return (and I 
never did follow up on the problem I had with the nested lock 
elimination handling. :( ).

Meanwhile this will need to be converted to a PR in any case.

Thanks,
David

On 9/09/2020 3:02 am, Reingruber, Richard wrote:
> Hello Marty,
> 
> Sure. I'd be happy if Serguei could review the change.
> 
> Thanks, Richard.
> 
> -Original Message-
> From: Marty Thompson 
> Sent: Dienstag, 8. September 2020 18:55
> To: Reingruber, Richard ; Daniel Daugherty 
> ; serviceability-dev 
> ; hotspot-compiler-...@openjdk.java.net; 
> Hotspot dev runtime 
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
> the Presence of JVMTI Agents
> 
> Hello Richard,
> 
> It would be good if Serguei Spitsyn could review before this is pushed.  
> Serguei is out this week.  Can you wait until Serguei is back in the office 
> the week of Sept 14?
> 
> Regards,
> 
> Marty
> 
>> -Original Message-
>> From: Reingruber, Richard 
>> Sent: Tuesday, September 8, 2020 9:45 AM
>> To: Daniel Daugherty ; serviceability-dev
>> ; hotspot-compiler-
>> d...@openjdk.java.net; Hotspot dev runtime > d...@openjdk.java.net>
>> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
>> in the Presence of JVMTI Agents
>>
>> Hi Dan,
>>
>> I'd be very happy about a review from somebody on the Serviceability team.
>> I have asked for reviews many times (kindly I hope). And the change is for
>> review for more than a year now.
>>
>> According to [1] I'd think all requirements to push are met already. But
>> maybe I missed something?
>>
>> After renaming of methods in SafepointMechanism the change needs to be
>> rebased (already done). I'll publish a pull request as soon as possible.
>>
>> Thanks, Richard.
>>
>> [1]
>> https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change
>>
>> -Original Message-
>> From: Daniel D. Daugherty 
>> Sent: Dienstag, 8. September 2020 18:16
>> To: Reingruber, Richard ; serviceability-dev
>> ; hotspot-compiler-
>> d...@openjdk.java.net; Hotspot dev runtime > d...@openjdk.java.net>
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
>> in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> I haven't seen a review from anyone on the Serviceability team and I think
>> you should get a review from them since JVM/TI is involved.
>> Perhaps I missed it...
>>
>> Dan
>>
>>
>> On 9/7/20 10:09 AM, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> I would like to close the review of this change.
>>>
>>> It has received a lot of helpful feedback during the process 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-08 Thread David Holmes

Hi Richard,

I suspect this one fell off the radar due to the extended review period. 
The actual review started last December (there was prior discussion 
IIRC) and only seemed to get partial reviews. I only looked at some 
parts. Robbin may have given things a deeper look, but seemed focused on 
the handshake aspects. Vladimir said he would do a full review but I 
can't find it. Eventually Martin and Goetz took over reviewing and 
everyone else dropped off. :(


As this covers a number of areas it really does need "approval" from 
each area (and yes the hotspot wiki should reflect this).


I will try to take another look while we await Serguei's return (and I 
never did follow up on the problem I had with the nested lock 
elimination handling. :( ).


Meanwhile this will need to be converted to a PR in any case.

Thanks,
David

On 9/09/2020 3:02 am, Reingruber, Richard wrote:

Hello Marty,

Sure. I'd be happy if Serguei could review the change.

Thanks, Richard.

-Original Message-
From: Marty Thompson 
Sent: Dienstag, 8. September 2020 18:55
To: Reingruber, Richard ; Daniel Daugherty 
; serviceability-dev ; 
hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime 
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hello Richard,

It would be good if Serguei Spitsyn could review before this is pushed.  
Serguei is out this week.  Can you wait until Serguei is back in the office the 
week of Sept 14?

Regards,

Marty


-Original Message-
From: Reingruber, Richard 
Sent: Tuesday, September 8, 2020 9:45 AM
To: Daniel Daugherty ; serviceability-dev
; hotspot-compiler-
d...@openjdk.java.net; Hotspot dev runtime 
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
in the Presence of JVMTI Agents

Hi Dan,

I'd be very happy about a review from somebody on the Serviceability team.
I have asked for reviews many times (kindly I hope). And the change is for
review for more than a year now.

According to [1] I'd think all requirements to push are met already. But
maybe I missed something?

After renaming of methods in SafepointMechanism the change needs to be
rebased (already done). I'll publish a pull request as soon as possible.

Thanks, Richard.

[1]
https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change

-Original Message-
From: Daniel D. Daugherty 
Sent: Dienstag, 8. September 2020 18:16
To: Reingruber, Richard ; serviceability-dev
; hotspot-compiler-
d...@openjdk.java.net; Hotspot dev runtime 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
in the Presence of JVMTI Agents

Hi Richard,

I haven't seen a review from anyone on the Serviceability team and I think
you should get a review from them since JVM/TI is involved.
Perhaps I missed it...

Dan


On 9/7/20 10:09 AM, Reingruber, Richard wrote:

Hi,

I would like to close the review of this change.

It has received a lot of helpful feedback during the process and 2
full Reviews. Thanks everybody!

I'm planning to push it this week on Thursday as solution for JBS items:

https://bugs.openjdk.java.net/browse/JDK-8227745
https://bugs.openjdk.java.net/browse/JDK-8233915

Version to be pushed:

http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/

Hope to get my GIT/Skara setup going until then... :)

Thanks, Richard.

-Original Message-
From: hotspot-compiler-dev
 On Behalf Of Reingruber,
Richard
Sent: Mittwoch, 2. September 2020 23:27
To: Robbin Ehn ; serviceability-dev
;
hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime

Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for
Better Performance in the Presence of JVMTI Agents

Hi Robin,


On 2020-09-02 15:48, Reingruber, Richard wrote:

Hi Robbin,

// taking the discussion back to the mailing lists

 > I still don't understand why you don't deoptimize the objects inside

the

 > handshake/safepoint instead?

So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

Just to make it clear: I'm not fond of the extra suspension mechanism
currently used for JDK-8227745 either. I want to get rid of it and I
will work on it. Asynch handshakes (JDK-8238761) could be a
replacement for it. At least I think they can be used to suspend the target

thread.



For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

Yes, this should be possible. This would be an advanced change though.
I would like EscapeBarriers to be a no-op and fall back to current
implementation, if C2-EscapeAnalysis/Graal are disabled.


So I suggest future work to instead just execute the safepoint with
the requesting JT instead of having a this special safepoiting mechanism.
Since you are 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-08 Thread Reingruber, Richard
Hello Marty,

Sure. I'd be happy if Serguei could review the change.

Thanks, Richard.

-Original Message-
From: Marty Thompson  
Sent: Dienstag, 8. September 2020 18:55
To: Reingruber, Richard ; Daniel Daugherty 
; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; 
Hotspot dev runtime 
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hello Richard,

It would be good if Serguei Spitsyn could review before this is pushed.  
Serguei is out this week.  Can you wait until Serguei is back in the office the 
week of Sept 14?

Regards,

Marty

> -Original Message-
> From: Reingruber, Richard 
> Sent: Tuesday, September 8, 2020 9:45 AM
> To: Daniel Daugherty ; serviceability-dev
> ; hotspot-compiler-
> d...@openjdk.java.net; Hotspot dev runtime  d...@openjdk.java.net>
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Dan,
> 
> I'd be very happy about a review from somebody on the Serviceability team.
> I have asked for reviews many times (kindly I hope). And the change is for
> review for more than a year now.
> 
> According to [1] I'd think all requirements to push are met already. But
> maybe I missed something?
> 
> After renaming of methods in SafepointMechanism the change needs to be
> rebased (already done). I'll publish a pull request as soon as possible.
> 
> Thanks, Richard.
> 
> [1]
> https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change
> 
> -Original Message-
> From: Daniel D. Daugherty 
> Sent: Dienstag, 8. September 2020 18:16
> To: Reingruber, Richard ; serviceability-dev
> ; hotspot-compiler-
> d...@openjdk.java.net; Hotspot dev runtime  d...@openjdk.java.net>
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> I haven't seen a review from anyone on the Serviceability team and I think
> you should get a review from them since JVM/TI is involved.
> Perhaps I missed it...
> 
> Dan
> 
> 
> On 9/7/20 10:09 AM, Reingruber, Richard wrote:
> > Hi,
> >
> > I would like to close the review of this change.
> >
> > It has received a lot of helpful feedback during the process and 2
> > full Reviews. Thanks everybody!
> >
> > I'm planning to push it this week on Thursday as solution for JBS items:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8227745
> > https://bugs.openjdk.java.net/browse/JDK-8233915
> >
> > Version to be pushed:
> >
> > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
> >
> > Hope to get my GIT/Skara setup going until then... :)
> >
> > Thanks, Richard.
> >
> > -Original Message-
> > From: hotspot-compiler-dev
> >  On Behalf Of Reingruber,
> > Richard
> > Sent: Mittwoch, 2. September 2020 23:27
> > To: Robbin Ehn ; serviceability-dev
> > ;
> > hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime
> > 
> > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for
> > Better Performance in the Presence of JVMTI Agents
> >
> > Hi Robin,
> >
> >> On 2020-09-02 15:48, Reingruber, Richard wrote:
> >>> Hi Robbin,
> >>>
> >>> // taking the discussion back to the mailing lists
> >>>
> >>> > I still don't understand why you don't deoptimize the objects inside
> the
> >>> > handshake/safepoint instead?
> >> So for handshakes using asynch handshake and allowing blocking inside
> >> would fix that. (future fix, I'm working on that now)
> > Just to make it clear: I'm not fond of the extra suspension mechanism
> > currently used for JDK-8227745 either. I want to get rid of it and I
> > will work on it. Asynch handshakes (JDK-8238761) could be a
> > replacement for it. At least I think they can be used to suspend the target
> thread.
> >
> >> For safepoint, since we have suspended all threads, ~'safepointed them'
> >> with a JavaThread, you _could_ just execute the action directly (e.g.
> >> skipping VM_HeapWalkOperation safepoint) since they are suppose to be
> >> safely suspended until the destructor of EB, no?
> > Yes, this should be possible. This would be an advanced change though.
> > I would like EscapeBarriers to be a no-op and fall back to current
> > implementation, if C2-EscapeAnalysis/Graal are disabled.
> >
> >> So I suggest future work to instead just execute the safepoint with
> >> the requesting JT instead of having a this special safepoiting mechanism.
> >> Since you are missing above functionality I see why you went this way.
> >> If you need to push it, it's fine by me.
> > We will work on further improvements. Top of the list would be
> > eliminating the extra suspend mechanism.
> >
> > The implementation has matured for more than 12 months now [1]. It's
> > been tested extensively at SAP over that time and passed also extended
> > testing at Oracle kindly conducted by Vladimir Kozlov. We've got two
> > full Reviews and incorporated extensive feedback from a number of
> > OpenJDK 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-08 Thread Marty Thompson
Hello Richard,

It would be good if Serguei Spitsyn could review before this is pushed.  
Serguei is out this week.  Can you wait until Serguei is back in the office the 
week of Sept 14?

Regards,

Marty

> -Original Message-
> From: Reingruber, Richard 
> Sent: Tuesday, September 8, 2020 9:45 AM
> To: Daniel Daugherty ; serviceability-dev
> ; hotspot-compiler-
> d...@openjdk.java.net; Hotspot dev runtime  d...@openjdk.java.net>
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Dan,
> 
> I'd be very happy about a review from somebody on the Serviceability team.
> I have asked for reviews many times (kindly I hope). And the change is for
> review for more than a year now.
> 
> According to [1] I'd think all requirements to push are met already. But
> maybe I missed something?
> 
> After renaming of methods in SafepointMechanism the change needs to be
> rebased (already done). I'll publish a pull request as soon as possible.
> 
> Thanks, Richard.
> 
> [1]
> https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change
> 
> -Original Message-
> From: Daniel D. Daugherty 
> Sent: Dienstag, 8. September 2020 18:16
> To: Reingruber, Richard ; serviceability-dev
> ; hotspot-compiler-
> d...@openjdk.java.net; Hotspot dev runtime  d...@openjdk.java.net>
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> I haven't seen a review from anyone on the Serviceability team and I think
> you should get a review from them since JVM/TI is involved.
> Perhaps I missed it...
> 
> Dan
> 
> 
> On 9/7/20 10:09 AM, Reingruber, Richard wrote:
> > Hi,
> >
> > I would like to close the review of this change.
> >
> > It has received a lot of helpful feedback during the process and 2
> > full Reviews. Thanks everybody!
> >
> > I'm planning to push it this week on Thursday as solution for JBS items:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8227745
> > https://bugs.openjdk.java.net/browse/JDK-8233915
> >
> > Version to be pushed:
> >
> > http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
> >
> > Hope to get my GIT/Skara setup going until then... :)
> >
> > Thanks, Richard.
> >
> > -Original Message-
> > From: hotspot-compiler-dev
> >  On Behalf Of Reingruber,
> > Richard
> > Sent: Mittwoch, 2. September 2020 23:27
> > To: Robbin Ehn ; serviceability-dev
> > ;
> > hotspot-compiler-...@openjdk.java.net; Hotspot dev runtime
> > 
> > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for
> > Better Performance in the Presence of JVMTI Agents
> >
> > Hi Robin,
> >
> >> On 2020-09-02 15:48, Reingruber, Richard wrote:
> >>> Hi Robbin,
> >>>
> >>> // taking the discussion back to the mailing lists
> >>>
> >>> > I still don't understand why you don't deoptimize the objects inside
> the
> >>> > handshake/safepoint instead?
> >> So for handshakes using asynch handshake and allowing blocking inside
> >> would fix that. (future fix, I'm working on that now)
> > Just to make it clear: I'm not fond of the extra suspension mechanism
> > currently used for JDK-8227745 either. I want to get rid of it and I
> > will work on it. Asynch handshakes (JDK-8238761) could be a
> > replacement for it. At least I think they can be used to suspend the target
> thread.
> >
> >> For safepoint, since we have suspended all threads, ~'safepointed them'
> >> with a JavaThread, you _could_ just execute the action directly (e.g.
> >> skipping VM_HeapWalkOperation safepoint) since they are suppose to be
> >> safely suspended until the destructor of EB, no?
> > Yes, this should be possible. This would be an advanced change though.
> > I would like EscapeBarriers to be a no-op and fall back to current
> > implementation, if C2-EscapeAnalysis/Graal are disabled.
> >
> >> So I suggest future work to instead just execute the safepoint with
> >> the requesting JT instead of having a this special safepoiting mechanism.
> >> Since you are missing above functionality I see why you went this way.
> >> If you need to push it, it's fine by me.
> > We will work on further improvements. Top of the list would be
> > eliminating the extra suspend mechanism.
> >
> > The implementation has matured for more than 12 months now [1]. It's
> > been tested extensively at SAP over that time and passed also extended
> > testing at Oracle kindly conducted by Vladimir Kozlov. We've got two
> > full Reviews and incorporated extensive feedback from a number of
> > OpenJDK Reviewers (including you, thanks!). Based on that I reckon
> > we're good to push the change as enhancement
> > (JDK-8227745) and bug fix (JDK-8233915).
> >
> >> Thanks for explaining once again :)
> > Pleasure :)
> >
> > Thanks, Richard.
> >
> > [1]
> > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/02
> > 8729.html
> >
> > -Original Message-
> > From: Robbin Ehn 
> > Sent: Mittwoch, 2. 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-08 Thread Reingruber, Richard
Hi Dan,

I'd be very happy about a review from somebody on the Serviceability team. I
have asked for reviews many times (kindly I hope). And the change is for review
for more than a year now.

According to [1] I'd think all requirements to push are met already. But maybe I
missed something?

After renaming of methods in SafepointMechanism the change needs to be rebased
(already done). I'll publish a pull request as soon as possible.

Thanks, Richard.

[1] https://wiki.openjdk.java.net/display/HotSpot/Pushing+a+HotSpot+change

-Original Message-
From: Daniel D. Daugherty  
Sent: Dienstag, 8. September 2020 18:16
To: Reingruber, Richard ; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; 
Hotspot dev runtime 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

I haven't seen a review from anyone on the Serviceability team and I
think you should get a review from them since JVM/TI is involved.
Perhaps I missed it...

Dan


On 9/7/20 10:09 AM, Reingruber, Richard wrote:
> Hi,
>
> I would like to close the review of this change.
>
> It has received a lot of helpful feedback during the process and 2 full
> Reviews. Thanks everybody!
>
> I'm planning to push it this week on Thursday as solution for JBS items:
>
> https://bugs.openjdk.java.net/browse/JDK-8227745
> https://bugs.openjdk.java.net/browse/JDK-8233915
>
> Version to be pushed:
>
> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
>
> Hope to get my GIT/Skara setup going until then... :)
>
> Thanks, Richard.
>
> -Original Message-
> From: hotspot-compiler-dev  On 
> Behalf Of Reingruber, Richard
> Sent: Mittwoch, 2. September 2020 23:27
> To: Robbin Ehn ; serviceability-dev 
> ; hotspot-compiler-...@openjdk.java.net; 
> Hotspot dev runtime 
> Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better 
> Performance in the Presence of JVMTI Agents
>
> Hi Robin,
>
>> On 2020-09-02 15:48, Reingruber, Richard wrote:
>>> Hi Robbin,
>>>
>>> // taking the discussion back to the mailing lists
>>>
>>> > I still don't understand why you don't deoptimize the objects inside 
>>> the
>>> > handshake/safepoint instead?
>> So for handshakes using asynch handshake and allowing blocking inside
>> would fix that. (future fix, I'm working on that now)
> Just to make it clear: I'm not fond of the extra suspension mechanism 
> currently
> used for JDK-8227745 either. I want to get rid of it and I will work on it. 
> Asynch
> handshakes (JDK-8238761) could be a replacement for it. At least I think they
> can be used to suspend the target thread.
>
>> For safepoint, since we have suspended all threads, ~'safepointed them'
>> with a JavaThread, you _could_ just execute the action directly (e.g.
>> skipping VM_HeapWalkOperation safepoint) since they are suppose to be
>> safely suspended until the destructor of EB, no?
> Yes, this should be possible. This would be an advanced change though. I would
> like EscapeBarriers to be a no-op and fall back to current implementation, if
> C2-EscapeAnalysis/Graal are disabled.
>
>> So I suggest future work to instead just execute the safepoint with the
>> requesting JT instead of having a this special safepoiting mechanism.
>> Since you are missing above functionality I see why you went this way.
>> If you need to push it, it's fine by me.
> We will work on further improvements. Top of the list would
> be eliminating the extra suspend mechanism.
>
> The implementation has matured for more than 12 months now [1]. It's been 
> tested
> extensively at SAP over that time and passed also extended testing at Oracle
> kindly conducted by Vladimir Kozlov. We've got two full Reviews and 
> incorporated
> extensive feedback from a number of OpenJDK Reviewers (including you,
> thanks!). Based on that I reckon we're good to push the change as enhancement
> (JDK-8227745) and bug fix (JDK-8233915).
>
>> Thanks for explaining once again :)
> Pleasure :)
>
> Thanks, Richard.
>
> [1] 
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html
>
> -Original Message-
> From: Robbin Ehn 
> Sent: Mittwoch, 2. September 2020 16:54
> To: Reingruber, Richard ; serviceability-dev 
> ; hotspot-compiler-...@openjdk.java.net; 
> Hotspot dev runtime 
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
> the Presence of JVMTI Agents
>
> Hi Richard,
>
> On 2020-09-02 15:48, Reingruber, Richard wrote:
>> Hi Robbin,
>>
>> // taking the discussion back to the mailing lists
>>
>> > I still don't understand why you don't deoptimize the objects inside 
>> the
>> > handshake/safepoint instead?
> So for handshakes using asynch handshake and allowing blocking inside
> would fix that. (future fix, I'm working on that now)
>
> For safepoint, since we have suspended all threads, ~'safepointed them'
> with a JavaThread, you _could_ just execute the action directly (e.g.
> skipping 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-08 Thread Daniel D. Daugherty

Hi Richard,

I haven't seen a review from anyone on the Serviceability team and I
think you should get a review from them since JVM/TI is involved.
Perhaps I missed it...

Dan


On 9/7/20 10:09 AM, Reingruber, Richard wrote:

Hi,

I would like to close the review of this change.

It has received a lot of helpful feedback during the process and 2 full
Reviews. Thanks everybody!

I'm planning to push it this week on Thursday as solution for JBS items:

https://bugs.openjdk.java.net/browse/JDK-8227745
https://bugs.openjdk.java.net/browse/JDK-8233915

Version to be pushed:

http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/

Hope to get my GIT/Skara setup going until then... :)

Thanks, Richard.

-Original Message-
From: hotspot-compiler-dev  On 
Behalf Of Reingruber, Richard
Sent: Mittwoch, 2. September 2020 23:27
To: Robbin Ehn ; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; Hotspot dev 
runtime 
Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better 
Performance in the Presence of JVMTI Agents

Hi Robin,


On 2020-09-02 15:48, Reingruber, Richard wrote:

Hi Robbin,

// taking the discussion back to the mailing lists

> I still don't understand why you don't deoptimize the objects inside the
> handshake/safepoint instead?

So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

Just to make it clear: I'm not fond of the extra suspension mechanism currently
used for JDK-8227745 either. I want to get rid of it and I will work on it. 
Asynch
handshakes (JDK-8238761) could be a replacement for it. At least I think they
can be used to suspend the target thread.


For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

Yes, this should be possible. This would be an advanced change though. I would
like EscapeBarriers to be a no-op and fall back to current implementation, if
C2-EscapeAnalysis/Graal are disabled.


So I suggest future work to instead just execute the safepoint with the
requesting JT instead of having a this special safepoiting mechanism.
Since you are missing above functionality I see why you went this way.
If you need to push it, it's fine by me.

We will work on further improvements. Top of the list would
be eliminating the extra suspend mechanism.

The implementation has matured for more than 12 months now [1]. It's been tested
extensively at SAP over that time and passed also extended testing at Oracle
kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated
extensive feedback from a number of OpenJDK Reviewers (including you,
thanks!). Based on that I reckon we're good to push the change as enhancement
(JDK-8227745) and bug fix (JDK-8233915).


Thanks for explaining once again :)

Pleasure :)

Thanks, Richard.

[1] 
http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html

-Original Message-
From: Robbin Ehn 
Sent: Mittwoch, 2. September 2020 16:54
To: Reingruber, Richard ; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; Hotspot dev 
runtime 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 2020-09-02 15:48, Reingruber, Richard wrote:

Hi Robbin,

// taking the discussion back to the mailing lists

> I still don't understand why you don't deoptimize the objects inside the
> handshake/safepoint instead?

So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

So I suggest future work to instead just execute the safepoint with the
requesting JT instead of having a this special safepoiting mechanism.

Since you are missing above functionality I see why you went this way.
If you need to push it, it's fine by me.

Thanks for explaining once again :)

/Robbin


This is unfortunately not possible. Deoptimizing objects includes reallocating
scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
cannot be done at a safepoint or handshake.

1. The vm thread is not allowed to allocate on the java heap
 See for instance assertions in ParallelScavengeHeap::mem_allocate()
 
https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKya9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$

 This is not easy to change, I suppose, because it 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-07 Thread Reingruber, Richard
Hi,

I would like to close the review of this change.

It has received a lot of helpful feedback during the process and 2 full
Reviews. Thanks everybody!

I'm planning to push it this week on Thursday as solution for JBS items:

https://bugs.openjdk.java.net/browse/JDK-8227745
https://bugs.openjdk.java.net/browse/JDK-8233915

Version to be pushed:

http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/

Hope to get my GIT/Skara setup going until then... :)

Thanks, Richard.

-Original Message-
From: hotspot-compiler-dev  On 
Behalf Of Reingruber, Richard
Sent: Mittwoch, 2. September 2020 23:27
To: Robbin Ehn ; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; 
Hotspot dev runtime 
Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better 
Performance in the Presence of JVMTI Agents

Hi Robin,

> On 2020-09-02 15:48, Reingruber, Richard wrote:
> > Hi Robbin,
> > 
> > // taking the discussion back to the mailing lists
> > 
> >> I still don't understand why you don't deoptimize the objects inside 
> > the
> >> handshake/safepoint instead?

> So for handshakes using asynch handshake and allowing blocking inside
> would fix that. (future fix, I'm working on that now)

Just to make it clear: I'm not fond of the extra suspension mechanism currently
used for JDK-8227745 either. I want to get rid of it and I will work on it. 
Asynch
handshakes (JDK-8238761) could be a replacement for it. At least I think they
can be used to suspend the target thread.

> For safepoint, since we have suspended all threads, ~'safepointed them'
> with a JavaThread, you _could_ just execute the action directly (e.g.
> skipping VM_HeapWalkOperation safepoint) since they are suppose to be
> safely suspended until the destructor of EB, no?

Yes, this should be possible. This would be an advanced change though. I would
like EscapeBarriers to be a no-op and fall back to current implementation, if
C2-EscapeAnalysis/Graal are disabled.

> So I suggest future work to instead just execute the safepoint with the
> requesting JT instead of having a this special safepoiting mechanism.

> Since you are missing above functionality I see why you went this way.
> If you need to push it, it's fine by me.

We will work on further improvements. Top of the list would
be eliminating the extra suspend mechanism.

The implementation has matured for more than 12 months now [1]. It's been tested
extensively at SAP over that time and passed also extended testing at Oracle
kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated
extensive feedback from a number of OpenJDK Reviewers (including you,
thanks!). Based on that I reckon we're good to push the change as enhancement
(JDK-8227745) and bug fix (JDK-8233915).

> Thanks for explaining once again :)

Pleasure :)

Thanks, Richard.

[1] 
http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html

-Original Message-
From: Robbin Ehn  
Sent: Mittwoch, 2. September 2020 16:54
To: Reingruber, Richard ; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; 
Hotspot dev runtime 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 2020-09-02 15:48, Reingruber, Richard wrote:
> Hi Robbin,
> 
> // taking the discussion back to the mailing lists
> 
>> I still don't understand why you don't deoptimize the objects inside the
>> handshake/safepoint instead?

So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

So I suggest future work to instead just execute the safepoint with the
requesting JT instead of having a this special safepoiting mechanism.

Since you are missing above functionality I see why you went this way.
If you need to push it, it's fine by me.

Thanks for explaining once again :)

/Robbin

> 
> This is unfortunately not possible. Deoptimizing objects includes reallocating
> scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
> cannot be done at a safepoint or handshake.
> 
> 1. The vm thread is not allowed to allocate on the java heap
> See for instance assertions in ParallelScavengeHeap::mem_allocate()
> 
> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKya9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$
> 
> This is not easy to change, I suppose, because it will be difficult to gc 
> if
> necessary.
> 
> 2. Using a direct handshake would not work either. The problem there is again
> gc. Let J be the 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-02 Thread Reingruber, Richard
Hi Robin,

> On 2020-09-02 15:48, Reingruber, Richard wrote:
> > Hi Robbin,
> > 
> > // taking the discussion back to the mailing lists
> > 
> >> I still don't understand why you don't deoptimize the objects inside 
> > the
> >> handshake/safepoint instead?

> So for handshakes using asynch handshake and allowing blocking inside
> would fix that. (future fix, I'm working on that now)

Just to make it clear: I'm not fond of the extra suspension mechanism currently
used for JDK-8227745 either. I want to get rid of it and I will work on it. 
Asynch
handshakes (JDK-8238761) could be a replacement for it. At least I think they
can be used to suspend the target thread.

> For safepoint, since we have suspended all threads, ~'safepointed them'
> with a JavaThread, you _could_ just execute the action directly (e.g.
> skipping VM_HeapWalkOperation safepoint) since they are suppose to be
> safely suspended until the destructor of EB, no?

Yes, this should be possible. This would be an advanced change though. I would
like EscapeBarriers to be a no-op and fall back to current implementation, if
C2-EscapeAnalysis/Graal are disabled.

> So I suggest future work to instead just execute the safepoint with the
> requesting JT instead of having a this special safepoiting mechanism.

> Since you are missing above functionality I see why you went this way.
> If you need to push it, it's fine by me.

We will work on further improvements. Top of the list would
be eliminating the extra suspend mechanism.

The implementation has matured for more than 12 months now [1]. It's been tested
extensively at SAP over that time and passed also extended testing at Oracle
kindly conducted by Vladimir Kozlov. We've got two full Reviews and incorporated
extensive feedback from a number of OpenJDK Reviewers (including you,
thanks!). Based on that I reckon we're good to push the change as enhancement
(JDK-8227745) and bug fix (JDK-8233915).

> Thanks for explaining once again :)

Pleasure :)

Thanks, Richard.

[1] 
http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-July/028729.html

-Original Message-
From: Robbin Ehn  
Sent: Mittwoch, 2. September 2020 16:54
To: Reingruber, Richard ; serviceability-dev 
; hotspot-compiler-...@openjdk.java.net; 
Hotspot dev runtime 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 2020-09-02 15:48, Reingruber, Richard wrote:
> Hi Robbin,
> 
> // taking the discussion back to the mailing lists
> 
>> I still don't understand why you don't deoptimize the objects inside the
>> handshake/safepoint instead?

So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

So I suggest future work to instead just execute the safepoint with the
requesting JT instead of having a this special safepoiting mechanism.

Since you are missing above functionality I see why you went this way.
If you need to push it, it's fine by me.

Thanks for explaining once again :)

/Robbin

> 
> This is unfortunately not possible. Deoptimizing objects includes reallocating
> scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
> cannot be done at a safepoint or handshake.
> 
> 1. The vm thread is not allowed to allocate on the java heap
> See for instance assertions in ParallelScavengeHeap::mem_allocate()
> 
> https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKya9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$
> 
> This is not easy to change, I suppose, because it will be difficult to gc 
> if
> necessary.
> 
> 2. Using a direct handshake would not work either. The problem there is again
> gc. Let J be the JavaThread that is executing the direct handshake. The vm
> would deadlock if the vm thread waits for J to execute the closure of a
> handshake-all and J waits for the vm thread to execute a gc vm operation.
> Patricio Chilano made me aware of this: 
> https://bugs.openjdk.java.net/browse/JDK-8230594
> 
> Cheers, Richard.
> 
> -Original Message-
> From: Robbin Ehn 
> Sent: Mittwoch, 2. September 2020 13:56
> To: Reingruber, Richard 
> Cc: Lindenmaier, Goetz ; Vladimir Kozlov 
> ; David Holmes 
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
> the Presence of JVMTI Agents
> 
> Hi,
> 
> I still don't understand why you don't deoptimize the objects inside the
> handshake/safepoint instead?
> 
> E.g.
> 
> JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-02 Thread Robbin Ehn

Hi Richard,

On 2020-09-02 15:48, Reingruber, Richard wrote:

Hi Robbin,

// taking the discussion back to the mailing lists

   > I still don't understand why you don't deoptimize the objects inside the
   > handshake/safepoint instead?


So for handshakes using asynch handshake and allowing blocking inside
would fix that. (future fix, I'm working on that now)

For safepoint, since we have suspended all threads, ~'safepointed them'
with a JavaThread, you _could_ just execute the action directly (e.g.
skipping VM_HeapWalkOperation safepoint) since they are suppose to be
safely suspended until the destructor of EB, no?

So I suggest future work to instead just execute the safepoint with the
requesting JT instead of having a this special safepoiting mechanism.

Since you are missing above functionality I see why you went this way.
If you need to push it, it's fine by me.

Thanks for explaining once again :)

/Robbin



This is unfortunately not possible. Deoptimizing objects includes reallocating
scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
cannot be done at a safepoint or handshake.

1. The vm thread is not allowed to allocate on the java heap
See for instance assertions in ParallelScavengeHeap::mem_allocate()

https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp*L258__;Iw!!GqivPVa7Brio!K0f5chjtePI6MKBSBOoBKya9YZTJlVhsExQYMDO96v3Af_Klc_E4R26_dSyowotF$

This is not easy to change, I suppose, because it will be difficult to gc if
necessary.

2. Using a direct handshake would not work either. The problem there is again
gc. Let J be the JavaThread that is executing the direct handshake. The vm
would deadlock if the vm thread waits for J to execute the closure of a
handshake-all and J waits for the vm thread to execute a gc vm operation.
Patricio Chilano made me aware of this: 
https://bugs.openjdk.java.net/browse/JDK-8230594

Cheers, Richard.

-Original Message-
From: Robbin Ehn 
Sent: Mittwoch, 2. September 2020 13:56
To: Reingruber, Richard 
Cc: Lindenmaier, Goetz ; Vladimir Kozlov 
; David Holmes 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi,

I still don't understand why you don't deoptimize the objects inside the
handshake/safepoint instead?

E.g.

JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the code
from:
eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the
stack, so:

void
GetOwnedMonitorInfoClosure::do_thread(Thread *target) {
assert(target->is_Java_thread(), "just checking");
JavaThread *jt = (JavaThread *)target;

if (!jt->is_exiting() && (jt->threadObj() != NULL)) {
+if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) {
_result =
((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt,
_owned_monitors_list);
  } else {
_result = JVMTI_ERROR_OUT_OF_MEMORY;
  }
}
}

Why try 'suspend' the thread first?


When we de-optimize all threads why not just in the following safepoint?
E.g.
VM_HeapWalkOperation::doit() {
+ EscapeBarrier::deoptimize_objects_all_threads();
...
}

Thanks, Robbin
  



RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-09-02 Thread Reingruber, Richard
Hi Robbin,

// taking the discussion back to the mailing lists

  > I still don't understand why you don't deoptimize the objects inside the 
  > handshake/safepoint instead?

This is unfortunately not possible. Deoptimizing objects includes reallocating
scalar replaced objects, i.e. calling Deoptimization::realloc_objects(). This
cannot be done at a safepoint or handshake.

1. The vm thread is not allowed to allocate on the java heap
   See for instance assertions in ParallelScavengeHeap::mem_allocate()
   
https://github.com/openjdk/jdk/blob/4c73e045ce815d52abcdc99499266ccf2e6e9b4c/src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp#L258

   This is not easy to change, I suppose, because it will be difficult to gc if
   necessary.

2. Using a direct handshake would not work either. The problem there is again
   gc. Let J be the JavaThread that is executing the direct handshake. The vm
   would deadlock if the vm thread waits for J to execute the closure of a
   handshake-all and J waits for the vm thread to execute a gc vm operation.
   Patricio Chilano made me aware of this: 
https://bugs.openjdk.java.net/browse/JDK-8230594

Cheers, Richard.

-Original Message-
From: Robbin Ehn  
Sent: Mittwoch, 2. September 2020 13:56
To: Reingruber, Richard 
Cc: Lindenmaier, Goetz ; Vladimir Kozlov 
; David Holmes 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi,

I still don't understand why you don't deoptimize the objects inside the 
handshake/safepoint instead?

E.g.

JvmtiEnv::GetOwnedMonitorInfo you only should need the execute the code 
from:
eb.deoptimize_objects(MaxJavaStackTraceDepth)) before looping over the 
stack, so:

void
GetOwnedMonitorInfoClosure::do_thread(Thread *target) {
   assert(target->is_Java_thread(), "just checking");
   JavaThread *jt = (JavaThread *)target;

   if (!jt->is_exiting() && (jt->threadObj() != NULL)) {
+if (EscapeBarrier::deoptimize_objects(jt, MaxJavaStackTraceDepth)) {
   _result = 
((JvmtiEnvBase*)_env)->get_owned_monitors(_calling_thread, jt, 
_owned_monitors_list);
 } else {
   _result = JVMTI_ERROR_OUT_OF_MEMORY;
 }
   }
}

Why try 'suspend' the thread first?


When we de-optimize all threads why not just in the following safepoint?
E.g.
VM_HeapWalkOperation::doit() {
+ EscapeBarrier::deoptimize_objects_all_threads();
...
}

Thanks, Robbin
 


RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-08-28 Thread Reingruber, Richard
Thanks a lot!

Richard.

-Original Message-
From: Lindenmaier, Goetz  
Sent: Freitag, 28. August 2020 08:38
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard, 

Thanks for the new webrev. 

The small improvements are fine, too.
Reviewed from my side.

Best regards,
  Goetz.

> -Original Message-
> From: Reingruber, Richard 
> Sent: Thursday, August 27, 2020 10:33 PM
> To: Lindenmaier, Goetz ; serviceability-
> d...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-
> runtime-...@openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Goetz,
> 
> > I read through your change again. It looks good to me now.
> > The new naming and additional comments make it
> > easier to read I think, thank you.
> 
> Thanks for all your input!
> 
> > One small thing:
> > deoptimization.cpp, l. 1503
> > You don't really need the brackets. Two lines below you don't use them
> either.
> > (No webrev needed)
> 
> Thanks for providing the correct line off list. Fixed!
> 
> I prepared a new webrev, because I had to rebase after JDK-8249293 [1] and
> because I wanted to make use of JDK-8251384 [2]
> 
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
> Delta:  http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/
> 
> The delta looks bigger than it is. Most of it is re-indentation of
> VM_GetOrSetLocal::deoptimize_objects(). You can see this if you look at
> 
> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/src/hotsp
> ot/share/prims/jvmtiImpl.cpp.udiff.html
> 
> which does not include the whitespace change.
> 
> Hope you are still ok with webrev.8. The changes are marginal. I've
> commented
> each below.
> 
> Thanks, Richard.
> 
> --- Details below ---
> 
> src/hotspot/share/prims/jvmtiImpl.cpp
> 
> @@ -425,11 +425,11 @@
>, _depth(depth)
>, _index(index)
>, _type(type)
>, _jvf(NULL)
>, _set(false)
> -  , _eb(NULL, NULL, false) // no references escape
> +  , _eb(NULL, NULL, type == T_OBJECT)
>, _result(JVMTI_ERROR_NONE)
> 
> Currently 'type' is never equal to T_OBJECT at this location, still I think it
> is better to check. The compiler will replace the compare with false.
> 
> @@ -630,11 +630,11 @@
>  }
> 
>  // Revert optimizations based on escape analysis if this is an access to a
> local object
>  bool VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf) {
>  #if COMPILER2_OR_JVMCI
> -  if (NOT_JVMCI(DoEscapeAnalysis &&) _type == T_OBJECT) {
> +  assert(_type == T_OBJECT, "EscapeBarrier should not be active if _type !=
> T_OBJECT");
> 
> I removed the if from VM_GetOrSetLocal::deoptimize_objects(), because
> now it
> only gets called if the VM_GetOrSetLocal instance has an active
> EscapeBarrier
> which will be the case iff the local type is T_OBJECT and if either C2 escape
> analysis is enabled or Graal is used.
> 
> src/hotspot/share/runtime/deoptimization.cpp
> 
> You suggested to remove the braces. Done.
> 
> src/hotspot/share/runtime/deoptimization.hpp
> 
> Must provide definition of EscapeBarrier::barrier_active() for new call site 
> in
> VM_GetOrSetLocal::doit_prologue() if building with COMPILER2_OR_JVMCI
> not
> defined.
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysis
> Enabled.java
> 
> Make use of [2] and pass test with minimal vm.
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8249293
> [2] https://bugs.openjdk.java.net/browse/JDK-8251384
> 
> -Original Message-
> From: Lindenmaier, Goetz 
> Sent: Samstag, 22. August 2020 07:46
> To: Reingruber, Richard ; serviceability-
> d...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-
> runtime-...@openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> I read through your change again. It looks good to me now.
> The new naming and additional comments make it
> easier to read I think, thank you.
> 
> One small thing:
> deoptimization.cpp, l. 1503
> You don't really need the brackets. Two lines below you don't use them
> either.
> (No webrev needed)
> 
> Best regards,
>   Goetz.



RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-08-28 Thread Lindenmaier, Goetz
Hi Richard, 

Thanks for the new webrev. 

The small improvements are fine, too.
Reviewed from my side.

Best regards,
  Goetz.

> -Original Message-
> From: Reingruber, Richard 
> Sent: Thursday, August 27, 2020 10:33 PM
> To: Lindenmaier, Goetz ; serviceability-
> d...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-
> runtime-...@openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Goetz,
> 
> > I read through your change again. It looks good to me now.
> > The new naming and additional comments make it
> > easier to read I think, thank you.
> 
> Thanks for all your input!
> 
> > One small thing:
> > deoptimization.cpp, l. 1503
> > You don't really need the brackets. Two lines below you don't use them
> either.
> > (No webrev needed)
> 
> Thanks for providing the correct line off list. Fixed!
> 
> I prepared a new webrev, because I had to rebase after JDK-8249293 [1] and
> because I wanted to make use of JDK-8251384 [2]
> 
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
> Delta:  http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/
> 
> The delta looks bigger than it is. Most of it is re-indentation of
> VM_GetOrSetLocal::deoptimize_objects(). You can see this if you look at
> 
> http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/src/hotsp
> ot/share/prims/jvmtiImpl.cpp.udiff.html
> 
> which does not include the whitespace change.
> 
> Hope you are still ok with webrev.8. The changes are marginal. I've
> commented
> each below.
> 
> Thanks, Richard.
> 
> --- Details below ---
> 
> src/hotspot/share/prims/jvmtiImpl.cpp
> 
> @@ -425,11 +425,11 @@
>, _depth(depth)
>, _index(index)
>, _type(type)
>, _jvf(NULL)
>, _set(false)
> -  , _eb(NULL, NULL, false) // no references escape
> +  , _eb(NULL, NULL, type == T_OBJECT)
>, _result(JVMTI_ERROR_NONE)
> 
> Currently 'type' is never equal to T_OBJECT at this location, still I think it
> is better to check. The compiler will replace the compare with false.
> 
> @@ -630,11 +630,11 @@
>  }
> 
>  // Revert optimizations based on escape analysis if this is an access to a
> local object
>  bool VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf) {
>  #if COMPILER2_OR_JVMCI
> -  if (NOT_JVMCI(DoEscapeAnalysis &&) _type == T_OBJECT) {
> +  assert(_type == T_OBJECT, "EscapeBarrier should not be active if _type !=
> T_OBJECT");
> 
> I removed the if from VM_GetOrSetLocal::deoptimize_objects(), because
> now it
> only gets called if the VM_GetOrSetLocal instance has an active
> EscapeBarrier
> which will be the case iff the local type is T_OBJECT and if either C2 escape
> analysis is enabled or Graal is used.
> 
> src/hotspot/share/runtime/deoptimization.cpp
> 
> You suggested to remove the braces. Done.
> 
> src/hotspot/share/runtime/deoptimization.hpp
> 
> Must provide definition of EscapeBarrier::barrier_active() for new call site 
> in
> VM_GetOrSetLocal::doit_prologue() if building with COMPILER2_OR_JVMCI
> not
> defined.
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysis
> Enabled.java
> 
> Make use of [2] and pass test with minimal vm.
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8249293
> [2] https://bugs.openjdk.java.net/browse/JDK-8251384
> 
> -Original Message-
> From: Lindenmaier, Goetz 
> Sent: Samstag, 22. August 2020 07:46
> To: Reingruber, Richard ; serviceability-
> d...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-
> runtime-...@openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> I read through your change again. It looks good to me now.
> The new naming and additional comments make it
> easier to read I think, thank you.
> 
> One small thing:
> deoptimization.cpp, l. 1503
> You don't really need the brackets. Two lines below you don't use them
> either.
> (No webrev needed)
> 
> Best regards,
>   Goetz.



RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-08-27 Thread Reingruber, Richard
Hi Goetz,

> I read through your change again. It looks good to me now.
> The new naming and additional comments make it 
> easier to read I think, thank you.

Thanks for all your input!

> One small thing:
> deoptimization.cpp, l. 1503
> You don't really need the brackets. Two lines below you don't use them either.
> (No webrev needed)

Thanks for providing the correct line off list. Fixed!

I prepared a new webrev, because I had to rebase after JDK-8249293 [1] and
because I wanted to make use of JDK-8251384 [2]

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8/
Delta:  http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/

The delta looks bigger than it is. Most of it is re-indentation of
VM_GetOrSetLocal::deoptimize_objects(). You can see this if you look at

http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.8.inc/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html

which does not include the whitespace change.

Hope you are still ok with webrev.8. The changes are marginal. I've commented
each below.

Thanks, Richard.

--- Details below ---

src/hotspot/share/prims/jvmtiImpl.cpp

@@ -425,11 +425,11 @@
   , _depth(depth)
   , _index(index)
   , _type(type)
   , _jvf(NULL)
   , _set(false)
-  , _eb(NULL, NULL, false) // no references escape
+  , _eb(NULL, NULL, type == T_OBJECT)
   , _result(JVMTI_ERROR_NONE)

Currently 'type' is never equal to T_OBJECT at this location, still I think it
is better to check. The compiler will replace the compare with false.

@@ -630,11 +630,11 @@
 }
 
 // Revert optimizations based on escape analysis if this is an access to a 
local object
 bool VM_GetOrSetLocal::deoptimize_objects(javaVFrame* jvf) {
 #if COMPILER2_OR_JVMCI
-  if (NOT_JVMCI(DoEscapeAnalysis &&) _type == T_OBJECT) {
+  assert(_type == T_OBJECT, "EscapeBarrier should not be active if _type != 
T_OBJECT");

I removed the if from VM_GetOrSetLocal::deoptimize_objects(), because now it
only gets called if the VM_GetOrSetLocal instance has an active EscapeBarrier
which will be the case iff the local type is T_OBJECT and if either C2 escape
analysis is enabled or Graal is used.

src/hotspot/share/runtime/deoptimization.cpp

You suggested to remove the braces. Done.

src/hotspot/share/runtime/deoptimization.hpp

Must provide definition of EscapeBarrier::barrier_active() for new call site in
VM_GetOrSetLocal::doit_prologue() if building with COMPILER2_OR_JVMCI not
defined.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java

Make use of [2] and pass test with minimal vm.

[1] https://bugs.openjdk.java.net/browse/JDK-8249293
[2] https://bugs.openjdk.java.net/browse/JDK-8251384

-Original Message-
From: Lindenmaier, Goetz  
Sent: Samstag, 22. August 2020 07:46
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

I read through your change again. It looks good to me now.
The new naming and additional comments make it 
easier to read I think, thank you.

One small thing:
deoptimization.cpp, l. 1503
You don't really need the brackets. Two lines below you don't use them either.
(No webrev needed)

Best regards,
  Goetz.



RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-08-21 Thread Lindenmaier, Goetz
Hi Richard,

I read through your change again. It looks good to me now.
The new naming and additional comments make it 
easier to read I think, thank you.

One small thing:
deoptimization.cpp, l. 1503
You don't really need the brackets. Two lines below you don't use them either.
(No webrev needed)

Best regards,
  Goetz.



-Original Message-
From: Reingruber, Richard  
Sent: Dienstag, 18. August 2020 10:44
To: Lindenmaier, Goetz ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Goetz,

I have collected the changes based on your feedback in a new webrev:

Webrev.7: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7/
Delta:http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7.inc/

Most of the changes are renamings, commenting, and reformatting.

Besides that ...

- I converted the native agent of the test IterateHeapWithEscapeAnalysisEnabled
  from C to C++, because this seems to be preferred by serviceability
  developers. I also re-indented the file, but excluded this from the delta
  webrev.

- I had to adapt test/jdk/com/sun/jdi/EATests.java to the fact that background
  compilation (-Xbatch) cannot be reliably disabled for JVMCI
  compilers. E.g. the compile broker will compile in the background if JVMCI is
  not yet fully initialized. Therefore it is possible that test cases are
  executed before the main test method is compiled on the highest level and then
  the test case fails. The higher the system load the higher the probability for
  this to happen. In webrev.7 I skip the compilation level check if the vm is
  configured to use the JVMCI compiler.

I also answered you inline below.

Thanks,
Richard.

-Original Message-
From: Lindenmaier, Goetz  
Sent: Donnerstag, 23. Juli 2020 16:20
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard, 

Thanks for your two further explanations in the other thread. 
That made the points clear to me.

> > I was not that happy with the names saying not_global_escape
> > and similar. I now agreed you have to use the terms of the escape
> > analysis (NoEscape ArgEscape= throughout the runtime code. I'm still not 
> > happy with
> > the 'not' in the term, I always try to expand the name to some
> > sentence with a negated verb, but it makes no sense.
> > For example, "has_not_global_escape_in_scope" expands to
> > "Hasn't a global escape in its scope." in my thinking, which makes
> > no sense. You probably mean
> > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape}
> > in its scope."
> 
> > C2 is using the word "non" in this context, e.g., here
> > alloc->is_non_escaping.
> 
> There is also ConnectionGraph::not_global_escape()
That talks about a single node that represents a single 
Object. An object has a single state wrt. ea.
You use the term for safepoint which tracks a set of objects.
Here, has_not_global_excape can mean
  1. None of the several objects does escape globaly.
  2. There is at least one object that escapes globaly.

> > non obviously negates the adjective 'global',
> > non-global or nonglobal even is a English term I find in the
> > net.
> > So what about "has_non_global_escape_in_scope?"
> 
> And what about has_ea_local_in_scope?
That's good. Please document somewhere that 
Ea_local == ArgEscape | NoEscape.
That's what it is, right?

> > Does jvmti specify that the same limits are used ...?
> > ok on your side.
> 
> I don't know and didn't find anything in a quick search.
Ok, not your business.

> 
> > jvmtiEnvBase.cpp  ok
> > jvmtiImpl.h|cpp  ok
> > jvmtiTagMap.cpp ok
> > whitebox.cpp ok
> 
> > deoptimization.cpp
> 
> > line 177: Please break line
> > line 246, 281: Please break line
> > 1578, 1583, 1589, 1632, 1649, 1651 Break line
> 
> > 1651: You use 'non'-terms, too: non-escaping :)
> 
> I know :) At least here it is wrong I'd say. "...has to be a not escaping 
> obj..."
> sounds better
> (hopefully not only to my german ears).
I thought the term non-escpaing makes it quite clear.
I just wanted to point out that using non above would
be similar to the wording here.

> > IterateHeapWithEscapeAnalysisEnabled.java
> 
> > line 415:
> > msg("wait until target thread has set testMethod_result");
> > while (testMethod_result == 0) {
> > Thread.sleep(50);
> > }
> > Might the test run into timeouts at this place?
> > The field is volatile, i.e. it will be reloaded
> > in each iteration. But will dontinline_testMethod
> > write it back to main memory in time?
> 
> You mean, the test could hang in that loop for a couple of minutes? I don't
> think so. There are cache coherence protocols in place which will invalidate
> 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-08-18 Thread Reingruber, Richard
Hi Goetz,

I have collected the changes based on your feedback in a new webrev:

Webrev.7: http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7/
Delta:http://cr.openjdk.java.net/~rrich/webrevs/8227745/webrev.7.inc/

Most of the changes are renamings, commenting, and reformatting.

Besides that ...

- I converted the native agent of the test IterateHeapWithEscapeAnalysisEnabled
  from C to C++, because this seems to be preferred by serviceability
  developers. I also re-indented the file, but excluded this from the delta
  webrev.

- I had to adapt test/jdk/com/sun/jdi/EATests.java to the fact that background
  compilation (-Xbatch) cannot be reliably disabled for JVMCI
  compilers. E.g. the compile broker will compile in the background if JVMCI is
  not yet fully initialized. Therefore it is possible that test cases are
  executed before the main test method is compiled on the highest level and then
  the test case fails. The higher the system load the higher the probability for
  this to happen. In webrev.7 I skip the compilation level check if the vm is
  configured to use the JVMCI compiler.

I also answered you inline below.

Thanks,
Richard.

-Original Message-
From: Lindenmaier, Goetz  
Sent: Donnerstag, 23. Juli 2020 16:20
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard, 

Thanks for your two further explanations in the other thread. 
That made the points clear to me.

> > I was not that happy with the names saying not_global_escape
> > and similar. I now agreed you have to use the terms of the escape
> > analysis (NoEscape ArgEscape= throughout the runtime code. I'm still not 
> > happy with
> > the 'not' in the term, I always try to expand the name to some
> > sentence with a negated verb, but it makes no sense.
> > For example, "has_not_global_escape_in_scope" expands to
> > "Hasn't a global escape in its scope." in my thinking, which makes
> > no sense. You probably mean
> > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape}
> > in its scope."
> 
> > C2 is using the word "non" in this context, e.g., here
> > alloc->is_non_escaping.
> 
> There is also ConnectionGraph::not_global_escape()
That talks about a single node that represents a single 
Object. An object has a single state wrt. ea.
You use the term for safepoint which tracks a set of objects.
Here, has_not_global_excape can mean
  1. None of the several objects does escape globaly.
  2. There is at least one object that escapes globaly.

> > non obviously negates the adjective 'global',
> > non-global or nonglobal even is a English term I find in the
> > net.
> > So what about "has_non_global_escape_in_scope?"
> 
> And what about has_ea_local_in_scope?
That's good. Please document somewhere that 
Ea_local == ArgEscape | NoEscape.
That's what it is, right?

> > Does jvmti specify that the same limits are used ...?
> > ok on your side.
> 
> I don't know and didn't find anything in a quick search.
Ok, not your business.

> 
> > jvmtiEnvBase.cpp  ok
> > jvmtiImpl.h|cpp  ok
> > jvmtiTagMap.cpp ok
> > whitebox.cpp ok
> 
> > deoptimization.cpp
> 
> > line 177: Please break line
> > line 246, 281: Please break line
> > 1578, 1583, 1589, 1632, 1649, 1651 Break line
> 
> > 1651: You use 'non'-terms, too: non-escaping :)
> 
> I know :) At least here it is wrong I'd say. "...has to be a not escaping 
> obj..."
> sounds better
> (hopefully not only to my german ears).
I thought the term non-escpaing makes it quite clear.
I just wanted to point out that using non above would
be similar to the wording here.

> > IterateHeapWithEscapeAnalysisEnabled.java
> 
> > line 415:
> > msg("wait until target thread has set testMethod_result");
> > while (testMethod_result == 0) {
> > Thread.sleep(50);
> > }
> > Might the test run into timeouts at this place?
> > The field is volatile, i.e. it will be reloaded
> > in each iteration. But will dontinline_testMethod
> > write it back to main memory in time?
> 
> You mean, the test could hang in that loop for a couple of minutes? I don't
> think so. There are cache coherence protocols in place which will invalidate
> stale data very timely.
Ok, anyways, it would only be a hanging test.
> 
> Ok. I've removed quite a lot of the occurrances.
> 
> > Also, I like full sentences in comments.
> > Especially for me as foreign speaker, this makes
> > things much more clear. I.e., I try to make it
> > a real sentence with articles, capitalized and a
> > dot at the end if there is a subject and a verb
> > in first place.
> > E.g., jvmtiEnvBase.cpp:1327
> 
> Are you referring to the following?
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hots
> pot/share/prims/jvmtiEnvBase.cpp.frames.html)
> 
> 1326
> 1327   // If the frame is a compiled one, need to 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-23 Thread Lindenmaier, Goetz
Hi Richard, 

Thanks for your two further explanations in the other thread. 
That made the points clear to me.

> > I was not that happy with the names saying not_global_escape
> > and similar. I now agreed you have to use the terms of the escape
> > analysis (NoEscape ArgEscape= throughout the runtime code. I'm still not 
> > happy with
> > the 'not' in the term, I always try to expand the name to some
> > sentence with a negated verb, but it makes no sense.
> > For example, "has_not_global_escape_in_scope" expands to
> > "Hasn't a global escape in its scope." in my thinking, which makes
> > no sense. You probably mean
> > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape}
> > in its scope."
> 
> > C2 is using the word "non" in this context, e.g., here
> > alloc->is_non_escaping.
> 
> There is also ConnectionGraph::not_global_escape()
That talks about a single node that represents a single 
Object. An object has a single state wrt. ea.
You use the term for safepoint which tracks a set of objects.
Here, has_not_global_excape can mean
  1. None of the several objects does escape globaly.
  2. There is at least one object that escapes globaly.

> > non obviously negates the adjective 'global',
> > non-global or nonglobal even is a English term I find in the
> > net.
> > So what about "has_non_global_escape_in_scope?"
> 
> And what about has_ea_local_in_scope?
That's good. Please document somewhere that 
Ea_local == ArgEscape | NoEscape.
That's what it is, right?

> > Does jvmti specify that the same limits are used ...?
> > ok on your side.
> 
> I don't know and didn't find anything in a quick search.
Ok, not your business.

> 
> > jvmtiEnvBase.cpp  ok
> > jvmtiImpl.h|cpp  ok
> > jvmtiTagMap.cpp ok
> > whitebox.cpp ok
> 
> > deoptimization.cpp
> 
> > line 177: Please break line
> > line 246, 281: Please break line
> > 1578, 1583, 1589, 1632, 1649, 1651 Break line
> 
> > 1651: You use 'non'-terms, too: non-escaping :)
> 
> I know :) At least here it is wrong I'd say. "...has to be a not escaping 
> obj..."
> sounds better
> (hopefully not only to my german ears).
I thought the term non-escpaing makes it quite clear.
I just wanted to point out that using non above would
be similar to the wording here.

> > IterateHeapWithEscapeAnalysisEnabled.java
> 
> > line 415:
> > msg("wait until target thread has set testMethod_result");
> > while (testMethod_result == 0) {
> > Thread.sleep(50);
> > }
> > Might the test run into timeouts at this place?
> > The field is volatile, i.e. it will be reloaded
> > in each iteration. But will dontinline_testMethod
> > write it back to main memory in time?
> 
> You mean, the test could hang in that loop for a couple of minutes? I don't
> think so. There are cache coherence protocols in place which will invalidate
> stale data very timely.
Ok, anyways, it would only be a hanging test.
> 
> Ok. I've removed quite a lot of the occurrances.
> 
> > Also, I like full sentences in comments.
> > Especially for me as foreign speaker, this makes
> > things much more clear. I.e., I try to make it
> > a real sentence with articles, capitalized and a
> > dot at the end if there is a subject and a verb
> > in first place.
> > E.g., jvmtiEnvBase.cpp:1327
> 
> Are you referring to the following?
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hots
> pot/share/prims/jvmtiEnvBase.cpp.frames.html)
> 
> 1326
> 1327   // If the frame is a compiled one, need to deoptimize it.
> 1328   if (vf->is_compiled_frame()) {
> 
> This line 1327 is preexisting.
Sorry, wrong line number again. 
I think I meant
1333 // eagerly reallocate scalar replaced objects.

But I must admit, the subject is missing. It's one of these 
imperative sentences where the subject is left out, which 
are used throughout documentation.
Bad example, but still a correct sentence, so qualifies 
for punctuation?

Best regards,
  Goetz.





RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-22 Thread Reingruber, Richard
Hi Goetz,

> Thanks for the quick reply.

Yes, this time it didn't take that long...

[... snip ...]

> > > > > I understand you annotate at safepoints where the escape analysis
> > > > > finds out that an object is "better" than global escape.
> > > > > This are the cases where the analysis identifies optimization
> > > > > opportunities. These annotations are then used to deoptimize
> > > > > frames and the objects referenced by them.
> > > > > Doesn't this overestimate the optimized
> > > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > > out.
> > > >
> > > > Yes, the implementation is conservative, but it is comparatively simple
> > and
> > > > the additional debug
> > > > info is just 2 flags per safepoint.
> > > Thanks. It also helped that you explained to me offline that
> > > there are more optimizations than only lock elimination and scalar
> > > replacement done based on the ea information.
> > > The ea refines the IR graph with allows follow up optimizations
> > > which can not easily be tracked back to the escaping objects or
> > > the call sites where they do not escape.
> > > Thus, if there are non-global escaping objects, you have to
> > > deoptimize the frame.
> > > Did I repeat that correctly?
> > 
> > Mostly, but there are also cases where deoptimization is required if and 
> > only
> > if ea-local objects
> > are passed as arguments. This is the case when values are not read directly
> > from a frame, but from a callee frame.
> Hmm, don't get this completely, but ok.

Let C be a callee frame of B which is a callee of A. If you use JVMTI to read an
object reference from a local variable of C then the implementation of
JDK-8227745 deoptimizes A if it passes any ea-local as argument, because the
reference could be ea-local in A and there might be optimizations that are
invalid after the escape state change.
  
> > > > Accesses to instance
> > > > members or array elements can be optimized as well.
> > > You mean the compiler can/will ignore volatile or memory ordering
> > > requirements for non-escaping objects? Sounds reasonable to do.
> > 
> > Yes, for instance. Also without volatile modifiers it will eliminate 
> > accesses.
> > Here is an example:
> > Method A has a NoEscape allocation O that is not scalar replaced. A calls
> > Method B, which is not
> > inlined. When you use your debugger to break in B, then modify a field of O,
> > then this modification
> > would have no effect without deoptimization, because the jit assumes that B
> > cannot modify O without
> > a reference to it.
> Yes, A can keep O in a register, while the JVMTI thread would write to 
> the location in the stack where the local is held (if it was written back).

Not quite. It is the value of the field of O that is in a register not the
reference to O itself. The agent changes the field's value in the /java heap/
(remember: O is _not_ scalar replaced), but the fields value is not reloaded
after return from B.

> > > > > Syncronization: looks good. I think others had a look at this before.
> > > > >
> > > > > EscapeBarrier::deoptimize_objects_internal()
> > > > >   The method name is misleading, it is not used by
> > > > >   deoptimize_objects().
> > > > >   Also, method with the same name is in Deopitmization.
> > > > >   Proposal: deoptimize_objects_thread() ?
> > > >
> > > > Sorry, but I don't see, why it would be misleading.
> > > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > > understand that name.
> > > 1. I have no idea why it's called "_internal". Because it is private?
> > >By the name, I would expect that EscapeBarrier::deoptimize_objects()
> > >calls it for some internal tasks. But it does not.
> > 
> > Well, I'd say it is pretty internal, what's happening in that method. So 
> > IMHO
> > the suffix _internal
> > is a match.
> > 
> > > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > > That's how I would have named it.
> > > But no bike shedding, if you don't see what I mean it's not obvious.
> > Ok. We could have a quick call, too, if you like.

> Ok, I think I have understood the remaining points.  I'm fine with this 
> so far.

Thanks again and best regards,
Richard.

-Original Message-
From: Lindenmaier, Goetz  
Sent: Mittwoch, 22. Juli 2020 18:22
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

Thanks for the quick reply.

> > >   With DeoptimizeObjectsALot enabled internal threads are started that
> > > deoptimize frames and
> > >   objects. The number of threads started are given with
> > > DeoptimizeObjectsALotThreadCountAll and
> > >   DeoptimizeObjectsALotThreadCountSingle. The former targets all
> existing
> > > 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-22 Thread Reingruber, Richard
Hi Goetz,

> > I'll answer to the obvious things in this mail now.
> > I'll go through the code thoroughly again and write
> > a review of my findings thereafter.
> As promised a detailed walk-throug, but without any major findings:

> c1_IR.hpp: ok
> ci_Env.h|cpp: ok
> compiledMethod.cpp, nmethod.cpp: ok
> debugInfoRec.h|cpp: ok
> scopeDesc.h|cpp ok

> compileBroker.h|cpp: 
> Maybe a bit of documentation how and why you start 
> the threads? I had expected there are two test
> scenarios run after each other, but now I understand 'Single'
> and 'All' run simultaneously.  Well, this really is a stress test!
> Also good the two variants of depotimization are
> stressed against each other.
> Besides that really nice it's all in one place.

Done.

> rootResolver.cpp: ok
> jvmciCodeInstaller.cpp: ok

> c2compiler.cpp: The essence of this change! Just one line :)
> Great!

:)

> callnode.hpp ok
> escape.h|cpp ok
> macro.cpp 
> I was not that happy with the names saying not_global_escape
> and similar. I now agreed you have to use the terms of the escape
> analysis (NoEscape ArgEscape= throughout the runtime code. I'm still not 
> happy with 
> the 'not' in the term, I always try to expand the name to some
> sentence with a negated verb, but it makes no sense.
> For example, "has_not_global_escape_in_scope" expands to 
> "Hasn't a global escape in its scope." in my thinking, which makes 
> no sense. You probably mean
> "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
> in its scope."

> C2 is using the word "non" in this context, e.g., here 
> alloc->is_non_escaping.

There is also ConnectionGraph::not_global_escape()

> non obviously negates the adjective 'global',
> non-global or nonglobal even is a English term I find in the 
> net. 
> So what about "has_non_global_escape_in_scope?"

And what about has_ea_local_in_scope?

> matcher.cpp ok

> output.cpp:1071
> Please break the long line.

Done.

> jvmtiCodeBlobEvents.cpp ok

> jvmtiEnv.cpp
> MaxJavaStackTraceDepth is only documented to affect
> the exceptions stack trace depth, not to limit jvmti 
> operations. Therefore I wondered why it is used here. 
> Non of your business, but the flag should
> document this in globals.hpp, too.  
> Does jvmti specify that the same limits are used ...?
> ok on your side.

I don't know and didn't find anything in a quick search.

> jvmtiEnvBase.cpp  ok
> jvmtiImpl.h|cpp  ok
> jvmtiTagMap.cpp ok
> whitebox.cpp ok

> deoptimization.cpp

> line 177: Please break line
> line 246, 281: Please break line
> 1578, 1583, 1589, 1632, 1649, 1651 Break line

> 1651: You use 'non'-terms, too: non-escaping :)

I know :) At least here it is wrong I'd say. "...has to be a not escaping 
obj..." sounds better
(hopefully not only to my german ears).

> 2805, 2929, 2946ff, break lines

> deoptimization.hpp

> 158, 174, 176 ... I would break lines too, but here you are in
> good company :)

Done.

> globals.hpp ok
> mutexLocker.h|cpp ok
> objectMonitor.cpp ok

> thread.cpp 

> 2631 typo: sapfepont --> safepoint

Done.

> thread.hpp ok
> thread.inline.hpp ok
> vframe.cpp ok
> vframe_hp.cpp   458ff break lines
> vframe_hp.hpp ok
> macros.hpp ok
> TEST.ROOT ok
> WhiteBox.java ok

> IterateHeapWithEscapeAnalysisEnabled.java

> line 415:
> msg("wait until target thread has set testMethod_result");
> while (testMethod_result == 0) {
> Thread.sleep(50);
> }
> Might the test run into timeouts at this place?
> The field is volatile, i.e. it will be reloaded
> in each iteration. But will dontinline_testMethod
> write it back to main memory in time?

You mean, the test could hang in that loop for a couple of minutes? I don't
think so. There are cache coherence protocols in place which will invalidate
stale data very timely.

> libIterateHeapWithEscapeAnalysisEnabled.c ok

> EATests.java

> This is a very elaborate test.
> I found a row of test cases illustrating issues
> we talked about before. Really helpful!

> 1311: TypeO materialize -> materialized

Found and fix typo at line 1369.
(Probably the cursor was on 1311 and your eyes on 1369 ;))

> 1640: setting local variable i triggers always deoptimization
>   --> setting local variable i always triggers deoptimization

Fixed.

> 2176: dontinline_calee --> dontinline_callee
> 2510: poping --> popping  ... but I'm not sure here.

Done.

> https://www.urbandictionary.com/define.php?term=poping
> poping
> Drinking large amounts of Dextromethorphan Hydrobromide (DXM)based cough 
> syrup, and then embarking on an adventure while wandering around 
> neighborhoods or parks all night. This is usually done while listening to 
> Punk rock music from a portable jambox. 
> ;)
> Don’t do it! 

OMG! How come you know?! ;)

> EATestsJVMTI.java

> I think you can just copy this test description into the other
> test. You can have two @test comments, they will be treated
> as separate tests.  The @requires will be evaluated accordingly.
> For an example see 
> 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-22 Thread Lindenmaier, Goetz
Hi Richard,

Thanks for the quick reply.

> > >   With DeoptimizeObjectsALot enabled internal threads are started that
> > > deoptimize frames and
> > >   objects. The number of threads started are given with
> > > DeoptimizeObjectsALotThreadCountAll and
> > >   DeoptimizeObjectsALotThreadCountSingle. The former targets all
> existing
> > > threads whereas the
> > >   latter operates on a single thread selected round robin.
> > >
> > >   I removed the mode where deoptimizations were performed at every nth
> > > exit from the runtime. I never used it.
> 
> > Do I get it right? You have a n:1 and a n:all test scenario.
> >  n:1: n threads deoptimize 1 Jana threadwhere n => DOALThreadCountSingle
> >  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?
> 
> Not quite.
> 
> -XX:+DeoptimizeObjectsALot // required
> -XX:DeoptimizeObjectsALotThreadCountAll=m
> -XX:DeoptimizeObjectsALotThreadCountSingle=n
> 
> Will start m+n threads. Each operating on all existing JavaThreads using
> EscapeBarriers. The
> difference between the 2 thread types is that one distinct EscapeBarrier
> targets either just a
> single thread or all exisitng threads at onece. If just one single thread is
> targeted per
> EscapeBarrier, then it is not always the same thread, but threads are selected
> round robin. So there
> will be n threads selecting independently single threads round robin per
> EscapeBarrier and m threads
> that target all threads in every EscapeBarrier.
Ok, yes, that is how I understood it. 
 
> > > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > > execute it always independently
> > >   of is_thread_fully_suspended().
> > Is this also a performance optimization?
> 
> Maybe a minor one.
OK

> > > * JavaThread::wait_for_object_deoptimization():
> > >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > > safepoint check! This
> > > caused issues with not walkable stacks with DeoptimizeObjectsALot.
> > OK. As I understand, there was one safepoint check in the old version,
> > now there is one in each iteration.  I assume this is intended, right?
> 
> Yes it is. The important thing here is (A) a safepoint check is needed /after/
> leaving a safe state
> (_thread_in_native, _thread_blocked). (B) Shared variables that are modified
> at safepoints or with handshakes need to be reread /after/ the safepoint 
> check.
> 
> BTW: I only noticed now that since JDK-8240918 JavaThreads themselves
> must disarm their polling
> page. Originally (before handshakes) this was done by the VM thread. With
> handshakes it was done by
> the thread executing the handshake op. This was changed for
> OrderAccess::cross_modify_fence() where
> the poll is left armed if the thread is in native and sice JDK-8240918 it is
> always left armed. So
> when a thread leaves a safe state (native, blocked) and there was a
> handshake/vm op, it will always
> call SafepointMechanism::block_if_requested_slow(), even if the
> handshake/vm operation have been
> processed already and everybody else is happyly executing bytecodes :)
Ok.

> Still (A) and (B) hold.

> > >   - Added limited spinning inspired by HandshakeSpinYield to fix 
> > > regression in
> > > microbenchmark [1]
> > Ok.  Nice improvement, nice catch!
> 
> Yes. It certainly took some time to find out.
> 
> > >
> > > I refer to some more changes answering your questions and comments
> inline
> > > below.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > [1] Microbenchmark:
> > >
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbe
> nchmark/
> > >
> 
> 
> > > > I understand you annotate at safepoints where the escape analysis
> > > > finds out that an object is "better" than global escape.
> > > > This are the cases where the analysis identifies optimization
> > > > opportunities. These annotations are then used to deoptimize
> > > > frames and the objects referenced by them.
> > > > Doesn't this overestimate the optimized
> > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > out.
> > >
> > > Yes, the implementation is conservative, but it is comparatively simple
> and
> > > the additional debug
> > > info is just 2 flags per safepoint.
> > Thanks. It also helped that you explained to me offline that
> > there are more optimizations than only lock elimination and scalar
> > replacement done based on the ea information.
> > The ea refines the IR graph with allows follow up optimizations
> > which can not easily be tracked back to the escaping objects or
> > the call sites where they do not escape.
> > Thus, if there are non-global escaping objects, you have to
> > deoptimize the frame.
> > Did I repeat that correctly?
> 
> Mostly, but there are also cases where deoptimization is required if and only
> if ea-local objects
> are passed as arguments. This is the case when values are not read directly
> from a frame, but from a callee frame.
Hmm, don't get this completely, but ok.
  

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-22 Thread Reingruber, Richard
Hi Goetz,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write 
> a review of my findings thereafter.

Sure. If trimmed my citations to relevant parts.

> > The delta includes many changes in comments, renaming of names, etc. So
> > I'd like to summarize
> > functional changes:
> > 
> > * Collected all the code for the testing feature DeoptimizeObjectsALot in
> > compileBroker.cpp and reworked it.
> Thanks, this makes it much more compact.

> >   With DeoptimizeObjectsALot enabled internal threads are started that
> > deoptimize frames and
> >   objects. The number of threads started are given with
> > DeoptimizeObjectsALotThreadCountAll and
> >   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> > threads whereas the
> >   latter operates on a single thread selected round robin.
> > 
> >   I removed the mode where deoptimizations were performed at every nth
> > exit from the runtime. I never used it.

> Do I get it right? You have a n:1 and a n:all test scenario.
>  n:1: n threads deoptimize 1 Jana threadwhere n = DOALThreadCountSingle
>  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?

Not quite.

-XX:+DeoptimizeObjectsALot // required
-XX:DeoptimizeObjectsALotThreadCountAll=m
-XX:DeoptimizeObjectsALotThreadCountSingle=n

Will start m+n threads. Each operating on all existing JavaThreads using 
EscapeBarriers. The
difference between the 2 thread types is that one distinct EscapeBarrier 
targets either just a
single thread or all exisitng threads at onece. If just one single thread is 
targeted per
EscapeBarrier, then it is not always the same thread, but threads are selected 
round robin. So there
will be n threads selecting independently single threads round robin per 
EscapeBarrier and m threads
that target all threads in every EscapeBarrier.


> > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > execute it always independently
> >   of is_thread_fully_suspended().
> Is this also a performance optimization?

Maybe a minor one.

> > * JavaThread::wait_for_object_deoptimization():
> >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > safepoint check! This
> > caused issues with not walkable stacks with DeoptimizeObjectsALot.
> OK. As I understand, there was one safepoint check in the old version, 
> now there is one in each iteration.  I assume this is intended, right?

Yes it is. The important thing here is (A) a safepoint check is needed /after/ 
leaving a safe state
(_thread_in_native, _thread_blocked). (B) Shared variables that are modified at 
safepoints or with
handshakes need to be reread /after/ the safepoint check.

BTW: I only noticed now that since JDK-8240918 JavaThreads themselves must 
disarm their polling
page. Originally (before handshakes) this was done by the VM thread. With 
handshakes it was done by
the thread executing the handshake op. This was change for 
OrderAccess::cross_modify_fence() where
the poll is left armed if the thread is in native and sice JDK-8240918 it is 
always left armed. So
when a thread leaves a safe state (native, blocked) and there was a 
handshake/vm op, it will always
call SafepointMechanism::block_if_requested_slow(), even if the handshake/vm 
operation have been
processed already and everybody else is happyly executing bytecodes :)

Still (A) and (B) hold.

> >   - Added limited spinning inspired by HandshakeSpinYield to fix regression 
> > in
> > microbenchmark [1]
> Ok.  Nice improvement, nice catch!

Yes. It certainly took some time to find out.

> > 
> > I refer to some more changes answering your questions and comments inline
> > below.
> > 
> > Thanks,
> > Richard.
> > 
> > [1] Microbenchmark:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > 


> > > I understand you annotate at safepoints where the escape analysis
> > > finds out that an object is "better" than global escape.
> > > This are the cases where the analysis identifies optimization
> > > opportunities. These annotations are then used to deoptimize
> > > frames and the objects referenced by them.
> > > Doesn't this overestimate the optimized
> > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > out.
> > 
> > Yes, the implementation is conservative, but it is comparatively simple and
> > the additional debug
> > info is just 2 flags per safepoint. 
> Thanks. It also helped that you explained to me offline that 
> there are more optimizations than only lock elimination and scalar
> replacement done based on the ea information.
> The ea refines the IR graph with allows follow up optimizations 
> which can not easily be tracked back to the escaping objects or 
> the call sites where they do not escape. 
> Thus, if there are non-global escaping objects, you have to 
> deoptimize the frame.
> Did I repeat that correctly?

Mostly, but there are also cases, where 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-17 Thread Lindenmaier, Goetz
Hi Richard,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write
> a review of my findings thereafter.
As promised a detailed walk-throug, but without any major findings:

c1_IR.hpp: ok
ci_Env.h|cpp: ok
compiledMethod.cpp, nmethod.cpp: ok
debugInfoRec.h|cpp: ok
scopeDesc.h|cpp ok

compileBroker.h|cpp: 
Maybe a bit of documentation how and why you start 
the threads? I had expected there are two test
scenarios run after each other, but now I understand 'Single'
and 'All' run simultaneously.  Well, this really is a stress test!
Also good the two variants of depotimization are
stressed against each other.
Besides that really nice it's all in one place.

rootResolver.cpp: ok
jvmciCodeInstaller.cpp: ok

c2compiler.cpp: The essence of this change! Just one line :)
Great!

callnode.hpp ok
escape.h|cpp ok
macro.cpp 
I was not that happy with the names saying not_global_escape
and similar. I now agreed you have to use the terms of the escape
analysis (NoEscape ArgEscape= throughout the runtime code. I'm still not happy 
with 
the 'not' in the term, I always try to expand the name to some
sentence with a negated verb, but it makes no sense.
For example, "has_not_global_escape_in_scope" expands to 
"Hasn't a global escape in its scope." in my thinking, which makes 
no sense. You probably mean
"Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
in its scope."

C2 is using the word "non" in this context, e.g., here 
alloc->is_non_escaping.

non obviously negates the adjective 'global',
non-global or nonglobal even is a English term I find in the 
net. 
So what about "has_non_global_escape_in_scope?"

matcher.cpp ok

output.cpp:1071
Please break the long line.

jvmtiCodeBlobEvents.cpp ok

jvmtiEnv.cpp
MaxJavaStackTraceDepth is only documented to affect
the exceptions stack trace depth, not to limit jvmti 
operations. Therefore I wondered why it is used here. 
Non of your business, but the flag should
document this in globals.hpp, too.  
Does jvmti specify that the same limits are used ...?
ok on your side.

jvmtiEnvBase.cpp  ok
jvmtiImpl.h|cpp  ok
jvmtiTagMap.cpp ok
whitebox.cpp ok

deoptimization.cpp

line 177: Please break line
line 246, 281: Please break line
1578, 1583, 1589, 1632, 1649, 1651 Break line

1651: You use 'non'-terms, too: non-escaping :)

2805, 2929, 2946ff, break lines

deoptimization.hpp

158, 174, 176 ... I would break lines too, but here you are in
good company :)

globals.hpp ok
mutexLocker.h|cpp ok
objectMonitor.cpp ok

thread.cpp 

2631 typo: sapfepont --> safepoint

thread.hpp ok
thread.inline.hpp ok
vframe.cpp ok
vframe_hp.cpp   458ff break lines
vframe_hp.hpp ok
macros.hpp ok
TEST.ROOT ok
WhiteBox.java ok

IterateHeapWithEscapeAnalysisEnabled.java

line 415:
msg("wait until target thread has set testMethod_result");
while (testMethod_result == 0) {
Thread.sleep(50);
}
Might the test run into timeouts at this place?
The field is volatile, i.e. it will be reloaded
in each iteration. But will dontinline_testMethod
write it back to main memory in time?

libIterateHeapWithEscapeAnalysisEnabled.c ok

EATests.java

This is a very elaborate test.
I found a row of test cases illustrating issues
we talked about before. Really helpful!

1311: TypeO materialize -> materialized

1640: setting local variable i triggers always deoptimization
  --> setting local variable i always triggers deoptimization

2176: dontinline_calee --> dontinline_callee
2510: poping --> popping  ... but I'm not sure here.

https://www.urbandictionary.com/define.php?term=poping
poping
Drinking large amounts of Dextromethorphan Hydrobromide (DXM)based cough syrup, 
and then embarking on an adventure while wandering around neighborhoods or 
parks all night. This is usually done while listening to Punk rock music from a 
portable jambox. 
;)
Don’t do it! 

EATestsJVMTI.java

I think you can just copy this test description into the other
test. You can have two @test comments, they will be treated
as separate tests.  The @requires will be evaluated accordingly.
For an example see 
test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
which has two different compile setups for the test class (-g).

so, that's it for reading code ...


Some general remarks, maybe a bit picky ...:
I think you could use less commas ',' in comments.
As I understand, you need a comma if the relative
sentence is at the beginning, but not if it is at 
the end:
  If Corona is over, I go to the office.
but
  I go to the office if Corona is over.
I think the same holds for 'because', 'while' etc.
E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
vframe_hp.hpp 104ff

Also, I like full sentences in comments.  
Especially for me as foreign speaker, this makes
things much more clear. I.e., I try to make it
a real sentence with articles, capitalized and a
dot at the end if there is a subject and a verb
in first place.
E.g., jvmtiEnvBase.cpp:1327
In many 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-07-16 Thread Lindenmaier, Goetz
Hi Richard, 

I'll answer to the obvious things in this mail now.
I'll go through the code thoroughly again and write 
a review of my findings thereafter.

> So here is the new webrev.6
> 
> Webrev.6:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/
Thanks for the incremental webrev, it's helpful!
 
> I spent most of the time running a microbenchmark [1] I wrote to answer
> questions from your
> review. At first I had trouble with variance in the results until I found out 
> it
> was due to the NUMA
> architecture of the server I used. After that I noticed that there was a
> performance regression of
> about 5% even at low agent activity. I finally found out that it was due to 
> the
> implementation of
> JavaThread::wait_for_object_deoptimization() which is called by the target
> of the JVMTI operation to
> self suspend for object deoptimization. I fixed this by adding limited 
> spinning
> before calling
> wait() on the monitor.
> 
> The delta includes many changes in comments, renaming of names, etc. So
> I'd like to summarize
> functional changes:
> 
> * Collected all the code for the testing feature DeoptimizeObjectsALot in
> compileBroker.cpp and reworked it.
Thanks, this makes it much more compact.

>   With DeoptimizeObjectsALot enabled internal threads are started that
> deoptimize frames and
>   objects. The number of threads started are given with
> DeoptimizeObjectsALotThreadCountAll and
>   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> threads whereas the
>   latter operates on a single thread selected round robin.
> 
>   I removed the mode where deoptimizations were performed at every nth
> exit from the runtime. I never used it.

Do I get it right? You have a n:1 and a n:all test scenario.
 n:1: n threads deoptimize 1 Jana threadwhere n = DOALThreadCountSingle
 n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?

> * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> execute it always independently
>   of is_thread_fully_suspended().
Is this also a performance optimization?

> * Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found
> this testing with DeoptimizeObjectsALot.
Ok.

> * Added EscapeBarrier::thread_removed().
Ok.

> * EscapeBarrier constructors: barriers can now be entirely disabled by
> disabling DoEscapeAnalysis.
>   This effectively disables the enhancement.
Good!

> * JavaThread::wait_for_object_deoptimization():
>   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> safepoint check! This
> caused issues with not walkable stacks with DeoptimizeObjectsALot.
OK. As I understand, there was one safepoint check in the old version, 
now there is one in each iteration.  I assume this is intended, right?

>   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> microbenchmark [1]
Ok.  Nice improvement, nice catch!

> 
> I refer to some more changes answering your questions and comments inline
> below.
> 
> Thanks,
> Richard.
> 
> [1] Microbenchmark:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 


> > I understand you annotate at safepoints where the escape analysis
> > finds out that an object is "better" than global escape.
> > This are the cases where the analysis identifies optimization
> > opportunities. These annotations are then used to deoptimize
> > frames and the objects referenced by them.
> > Doesn't this overestimate the optimized
> > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > out.
> 
> Yes, the implementation is conservative, but it is comparatively simple and
> the additional debug
> info is just 2 flags per safepoint. 
Thanks. It also helped that you explained to me offline that 
there are more optimizations than only lock elimination and scalar
replacement done based on the ea information.
The ea refines the IR graph with allows follow up optimizations 
which can not easily be tracked back to the escaping objects or 
the call sites where they do not escape. 
Thus, if there are non-global escaping objects, you have to 
deoptimize the frame.
Did I repeat that correctly?
With this understanding, a row of my proposed renamings/comments
are obsolete.


> On the other hand, those JVMTI operations
> that really trigger
> deoptimizations are expected to be comparatively infrequent such that
> switching to the interpreter
> for a few microseconds will hardly have an effect.
That sounds reasonable.

> I've done microbenchmarking to check this.
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbe
> nchmark/
> 
> I found that in the worst case performance can be impacted by 10%. If the
> agent is extremely active
> and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> millisecond or more often,
> then the performance impact can be 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-05-06 Thread Lindenmaier, Goetz
Hi Richard,

I had a look at your change.  It's complex, but not that big.
A lot of code is just passing info through layers of abstraction.
Also, one can tell this went through some iterations by now, 
I think it's very well engineered.
I had a look at webrev.05

Unfortunately
"8242425: JVMTI monitor operations should use Thread-Local Handshakes" 
breaks webrev.05.
I updated to before that change and took that as base of my review.

I see four parts of the change that can be looked at
rather individually.

 * Refactoring the scopeDesc constructors. Trivial.
 * Persisting information about the optimizations done by the compilers.
   Large and mostly trivial.
 * Deoptimizing. The most complicated part. Really well abstracted, though.
 * DeoptimizeObjectsALot for testing and the tests.

Review of compiler changes:

I understand you annotate at safepoints where the escape analysis
finds out that an object is "better" than global escape. 
This are the cases where the analysis identifies optimization 
opportunities. These annotations are then used to deoptimize
frames and the objects referenced by them.
Doesn't this overestimate the optimized 
objects?  E.g., eliminate_alloc_node has many cases where it bails
out.

c1_IR.hpp   

OK, nothing to do for C1, just adapt to extended method signature.

Break line once more so that it matches above line length.


ciEnv.h|cpp

Pass through another jvmti capability.  Trivial & good.


debugInfoRec.hpp

Pass through escape info that must be recorded. OK.

pcDesc.hpp

I would like to see some documentation of the methods.

Maybe:
  // There is an object in the scope that does not escape globally.
  // It either does not escape at all or it escapes as arguemnt.
and
  // One of the arguments is an object that is not globally visible
  // but escapes to the callee.

scopeDesc.cpp

  Besides refactoring copy escape info from pcDesc to scopeDesc
  and add accessors. Trivial.

  In scopeDesc.hpp you talk about NoEscape and ArgEscape. 
  This are opto terms, but scopeDesc is a shared datastructure
  that does not depend on a specific compiler. 
  Please explain what is going on without using these terms.

jvmciCodeInstaller.cpp

  OK, nothing for JVMCI. Here support for Object Optimizations 
  for JVMCI compilers could be added. Leave this to graal people.

callnode.hpp

You add functionality to annotate callnodes with escape information 
This is carried through code generation to final output where it is
added to the compiled methods meta information.

At Safepoints in general jvmti can access
  - Objects that were scalar replaced. They must be reallocated.
(Flag EliminateAllocations)
  - Objects that should be locked but are not because they never 
escape the thread. They need to be relocked.

At calls, Objects where locks have been removed escape to callees.
We must persist this information so that if jvmti accesses the 
object in a callee, we can determine by looking at the caller that
it needs to be relocked.

A side comment: 
I think the flage handling in Opto is not very intuitive.
DoEscapeAnalysis depends on the jvmti capabilities.
This makes no sense. It is only an analysis. The optimizations
should depend on the jvmti capabilities.
The correct setup would be to handle this in 
CompilerConfig::ergo_initialize():
If the jvmti capabilities allow, enable the optimizations 
EliminateAllocations or  EliminateLocks/EliminateNestedLocks.
If one of these optimizations is on, enable EscapeAnalysis.
 -- end side comment.

So I would propose the following comments:

  // In the scope of this safepoints there are objects
  // that do not globally escape. They are either NoEscape or
  // ArgEscape. As such, they might be subject to optimizations.
  // Persist this information here so that the frame an the
  // Objects in scope can 
  // be deoptimized if jvmti accesses an object at this safepoint.
  void set_not_global_escape_in_scope(bool b) {

  // This call passes objects that do not globally escape 
  // to its callee. The object might be subject to optimization, 
  // e.g. a lock might be omitted. Persist this information here 
  // so that on a jvmti access to the callee frame we can deoptimize
  // the object and this frame.
  void  set_arg_escape(bool f) { _arg_escape = f; }

Actuall I am not sure whether the name of these fields (and all 
the others in the course of this change) should refer to 
escape analysis.  I think the term "Object deoptimization" 
you also use is much better. You could call these properties 
(througout the whole change) 
  set_optimized_objects_in_scope()
and
  set_passes_optimized_objects().

I think this would make the whole matter much easier
to understand. 

Anyways, locks can already be removed without running
escape analysis at all. C2 recognizes some local patterns
that allow this.

escape.h|cpp

The code looks good. 

Line 325: The comment could be a bit more elaborate:
  // Annotate at safepoints if they have <= ArgEscape 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-04-01 Thread Reingruber, Richard
  > Thanks for cleaning up thread.hpp!

Thanks for providing the feedback!

I justed noticed that the forward declaration of class 
jvmtiDeferredLocalVariableSet is not required anymore. Will remove it in the 
next webrev. Hope to get some more (partial) reviews.

Thanks, Richard.

-Original Message-
From: Robbin Ehn  
Sent: Dienstag, 31. März 2020 16:21
To: Reingruber, Richard ; Doerr, Martin 
; Lindenmaier, Goetz ; David 
Holmes ; Vladimir Kozlov (vladimir.koz...@oracle.com) 
; serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Thanks for cleaning up thread.hpp!

/Robbin

On 2020-03-30 10:31, Reingruber, Richard wrote:
> Hi,
> 
> this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> 
> The change affects jvmti, hotspot and c2. Partial reviews are very welcome 
> too.
> 
> Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> 
> Robbin, Martin, please let me know, if anything shouldn't be quite as you 
> wanted it. Also find my
> comments on your feedback below.
> 
> Robbin, can I count you as Reviewer for the runtime part?
> 
> Thanks, Richard.
> 
> --
> 
>> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
>> You can move both declaration and definition to that file, no need to clobber
>> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Done.
> 
>> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's 
>> own
>> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting 
> jvmtiDeferredLocalVariableSet is
> declared.
> 
>> src/hotspot/share/code/compiledMethod.cpp
>> Nice cleanup!
> 
> Thanks :)
> 
>> src/hotspot/share/code/debugInfoRec.cpp
>> src/hotspot/share/code/debugInfoRec.hpp
>> Additional parmeters. (Remark: I think "non_global_escape_in_scope" would 
>> read better than "not_global_escape_in_scope", but your version is 
>> consistent with existing code, so no change request from my side.) Ok.
> 
> I've been thinking about this too and finally stayed with 
> not_global_escape_in_scope. It's supposed
> to mean an object whose escape state is not GlobalEscape is in scope.
> 
>> src/hotspot/share/compiler/compileBroker.cpp
>> src/hotspot/share/compiler/compileBroker.hpp
>> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a 
>> follow up change together with the test in order to make this webrev 
>> smaller, but since it is included, I'm reviewing everything at once. Not a 
>> big deal.) Ok.
> 
> Yes the change would be a little smaller. And if it helps I'll split it off. 
> In general I prefer
> patches that bring along a suitable amount of tests.
> 
>> src/hotspot/share/opto/c2compiler.cpp
>> Make do_escape_analysis independent of JVMCI capabilities. Nice!
> 
> It is the main goal of the enhancement. It is done for C2, but could be done 
> for JVMCI compilers
> with just a small effort as well.
> 
>> src/hotspot/share/opto/escape.cpp
>> Annotation for MachSafePointNodes. Your added functionality looks correct.
>> But I'd prefer to move the bulky code out of the large function.
>> I suggest to factor out something like has_not_global_escape and 
>> has_arg_escape. So the code could look like this:
>>SafePointNode* sfn = sfn_worklist.at(next);
>>sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>>if (sfn->is_CallJava()) {
>>  CallJavaNode* call = sfn->as_CallJava();
>>  call->set_arg_escape(has_arg_escape(call));
>>}
>> This would also allow us to get rid of the found_..._escape_in_args 
>> variables making the loops better readable.
> 
> Done.
> 
>> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems 
>> to be the way to do it (there are more such places). So it's ok.
> 
> Yeah. I copied the snippet.
> 
>> src/hotspot/share/prims/jvmtiImpl.cpp
>> src/hotspot/share/prims/jvmtiImpl.hpp
>> The sequence is pretty complex:
>> VM_GetOrSetLocal element initialization executes EscapeBarrier code which 
>> suspends the target thread (extra VM Operation).
> 
> Note that the target threads have to be suspended already for 
> VM_GetOrSetLocal*. So it's mainly the
> synchronization effect of EscapeBarrier::sync_and_suspend_one() that is 
> required here. Also no extra
> _handshake_ is executed, since sync_and_suspend_one() will find the target 
> threads already
> suspended.
> 
>> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread 
>> to prepare VM Operation with frame deoptimization).
>> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which 
>> resumes the target thread.
>> But I don't have any improvement proposal. 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-04-01 Thread Reingruber, Richard
Hi Martin,

> thanks for addressing all my points. I've looked over webrev.5 and I'm 
> satisfied with your changes.

Thanks!

> I had also promised to review the tests.

Thanks++
I appreciate it very much, the tests are many lines of code.

> test/jdk/com/sun/jdi/EATests.java
> This is a substantial amount of tests which is appropriate for a such a large 
> change. Skipping some subtests with UseJVMCICompiler makes sense because it 
> doesn't provide the necessary JVMTI functionality, yet.
> Nice work!
> I also like that you test with and without BiasedLocking. Your tests will 
> still be fine after BiasedLocking deprecation.

Hope so :)

> Very minor nits:
> - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are 
> ommitted" (sounds funny)
> - You sometimes write "graal" and sometimes "Graal". I guess the capital G is 
> better. (Also in EATestsJVMCI.java.)

> test/jdk/com/sun/jdi/EATestsJVMCI.java
> EATests with Graal enabled. Nice that you support Graal to some extent. Maybe 
> Graal folks want to enhance them in the future. I think this is a good 
> starting point.

Will change this in the next webrev.

> Conclusion: Looks good and not trivial :-)
> Now, you have one full review. I'd be ok with covering 2nd review by partial 
> reviews.
> Compiler and JVMTI parts are not too complicated IMHO.
> Runtime part should get at least one additional careful review.

Thanks a lot,
Richard.

-Original Message-
From: Doerr, Martin  
Sent: Dienstag, 31. März 2020 16:01
To: Reingruber, Richard ; 'Robbin Ehn' 
; Lindenmaier, Goetz ; David 
Holmes ; Vladimir Kozlov (vladimir.koz...@oracle.com) 
; serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

thanks for addressing all my points. I've looked over webrev.5 and I'm 
satisfied with your changes.


I had also promised to review the tests.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
Thanks for updating the @summary comment. Looks good in webrev.5.

test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
JVMTI agent for object tagging and heap iteration. Good.

test/jdk/com/sun/jdi/EATests.java
This is a substantial amount of tests which is appropriate for a such a large 
change. Skipping some subtests with UseJVMCICompiler makes sense because it 
doesn't provide the necessary JVMTI functionality, yet.
Nice work!
I also like that you test with and without BiasedLocking. Your tests will still 
be fine after BiasedLocking deprecation.

Very minor nits:
- 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are 
ommitted" (sounds funny)
- You sometimes write "graal" and sometimes "Graal". I guess the capital G is 
better. (Also in EATestsJVMCI.java.)

test/jdk/com/sun/jdi/EATestsJVMCI.java
EATests with Graal enabled. Nice that you support Graal to some extent. Maybe 
Graal folks want to enhance them in the future. I think this is a good starting 
point.


Conclusion: Looks good and not trivial :-)
Now, you have one full review. I'd be ok with covering 2nd review by partial 
reviews.
Compiler and JVMTI parts are not too complicated IMHO.
Runtime part should get at least one additional careful review.

Best regards,
Martin


> -Original Message-
> From: Reingruber, Richard 
> Sent: Montag, 30. März 2020 10:32
> To: Doerr, Martin ; 'Robbin Ehn'
> ; Lindenmaier, Goetz
> ; David Holmes ;
> Vladimir Kozlov (vladimir.koz...@oracle.com)
> ; serviceability-dev@openjdk.java.net;
> hotspot-compiler-...@openjdk.java.net; hotspot-runtime-
> d...@openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi,
> 
> this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> 
> The change affects jvmti, hotspot and c2. Partial reviews are very welcome
> too.
> 
> Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> 
> Robbin, Martin, please let me know, if anything shouldn't be quite as you
> wanted it. Also find my
> comments on your feedback below.
> 
> Robbin, can I count you as Reviewer for the runtime part?
> 
> Thanks, Richard.
> 
> --
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to
> clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Done.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's
> own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting
> jvmtiDeferredLocalVariableSet is
> declared.
> 
> > 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-31 Thread Robbin Ehn

Thanks for cleaning up thread.hpp!

/Robbin

On 2020-03-30 10:31, Reingruber, Richard wrote:

Hi,

this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)

The change affects jvmti, hotspot and c2. Partial reviews are very welcome too.

Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/

Robbin, Martin, please let me know, if anything shouldn't be quite as you 
wanted it. Also find my
comments on your feedback below.

Robbin, can I count you as Reviewer for the runtime part?

Thanks, Richard.

--


DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both declaration and definition to that file, no need to clobber
thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)


Done.


Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own
hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.


I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting 
jvmtiDeferredLocalVariableSet is
declared.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


Thanks :)


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parmeters. (Remark: I think "non_global_escape_in_scope" would read better 
than "not_global_escape_in_scope", but your version is consistent with existing code, so 
no change request from my side.) Ok.


I've been thinking about this too and finally stayed with 
not_global_escape_in_scope. It's supposed
to mean an object whose escape state is not GlobalEscape is in scope.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a 
follow up change together with the test in order to make this webrev smaller, 
but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


Yes the change would be a little smaller. And if it helps I'll split it off. In 
general I prefer
patches that bring along a suitable amount of tests.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!


It is the main goal of the enhancement. It is done for C2, but could be done 
for JVMCI compilers
with just a small effort as well.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest to factor out something like has_not_global_escape and 
has_arg_escape. So the code could look like this:
   SafePointNode* sfn = sfn_worklist.at(next);
   sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
   if (sfn->is_CallJava()) {
 CallJavaNode* call = sfn->as_CallJava();
 call->set_arg_escape(has_arg_escape(call));
   }
This would also allow us to get rid of the found_..._escape_in_args variables 
making the loops better readable.


Done.


It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to 
be the way to do it (there are more such places). So it's ok.


Yeah. I copied the snippet.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which 
suspends the target thread (extra VM Operation).


Note that the target threads have to be suspended already for 
VM_GetOrSetLocal*. So it's mainly the
synchronization effect of EscapeBarrier::sync_and_suspend_one() that is 
required here. Also no extra
_handshake_ is executed, since sync_and_suspend_one() will find the target 
threads already
suspended.


VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to 
prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which 
resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a 
concern, here. So it's ok.



VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has 
non-globally escaping objects and other frames if they have arg escaping ones. 
Good.


It's not specifically the top frame, but the frame that is accessed.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, 
but looks correct.
Comments are sufficient to understand why things are done as they are 
implemented.



BiasedLocking related parts are complex, but we may get rid of them in the 
future (with BiasedLocking removal).
Anyway, looks correct, too.



Typo in comment: "regularily" => "regularly"



Deoptimization::fetch_unroll_info_helper is the only place where 
_jvmti_deferred_updates get deallocated (except JavaThread destructor). But I 
think we always go through 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-31 Thread Doerr, Martin
Hi Richard,

thanks for addressing all my points. I've looked over webrev.5 and I'm 
satisfied with your changes.


I had also promised to review the tests.

test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
Thanks for updating the @summary comment. Looks good in webrev.5.

test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
JVMTI agent for object tagging and heap iteration. Good.

test/jdk/com/sun/jdi/EATests.java
This is a substantial amount of tests which is appropriate for a such a large 
change. Skipping some subtests with UseJVMCICompiler makes sense because it 
doesn't provide the necessary JVMTI functionality, yet.
Nice work!
I also like that you test with and without BiasedLocking. Your tests will still 
be fine after BiasedLocking deprecation.

Very minor nits:
- 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are 
ommitted" (sounds funny)
- You sometimes write "graal" and sometimes "Graal". I guess the capital G is 
better. (Also in EATestsJVMCI.java.)

test/jdk/com/sun/jdi/EATestsJVMCI.java
EATests with Graal enabled. Nice that you support Graal to some extent. Maybe 
Graal folks want to enhance them in the future. I think this is a good starting 
point.


Conclusion: Looks good and not trivial :-)
Now, you have one full review. I'd be ok with covering 2nd review by partial 
reviews.
Compiler and JVMTI parts are not too complicated IMHO.
Runtime part should get at least one additional careful review.

Best regards,
Martin


> -Original Message-
> From: Reingruber, Richard 
> Sent: Montag, 30. März 2020 10:32
> To: Doerr, Martin ; 'Robbin Ehn'
> ; Lindenmaier, Goetz
> ; David Holmes ;
> Vladimir Kozlov (vladimir.koz...@oracle.com)
> ; serviceability-dev@openjdk.java.net;
> hotspot-compiler-...@openjdk.java.net; hotspot-runtime-
> d...@openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi,
> 
> this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> 
> The change affects jvmti, hotspot and c2. Partial reviews are very welcome
> too.
> 
> Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> 
> Robbin, Martin, please let me know, if anything shouldn't be quite as you
> wanted it. Also find my
> comments on your feedback below.
> 
> Robbin, can I count you as Reviewer for the runtime part?
> 
> Thanks, Richard.
> 
> --
> 
> > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > You can move both declaration and definition to that file, no need to
> clobber
> > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> 
> Done.
> 
> > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's
> own
> > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> 
> I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting
> jvmtiDeferredLocalVariableSet is
> declared.
> 
> > src/hotspot/share/code/compiledMethod.cpp
> > Nice cleanup!
> 
> Thanks :)
> 
> > src/hotspot/share/code/debugInfoRec.cpp
> > src/hotspot/share/code/debugInfoRec.hpp
> > Additional parmeters. (Remark: I think "non_global_escape_in_scope"
> would read better than "not_global_escape_in_scope", but your version is
> consistent with existing code, so no change request from my side.) Ok.
> 
> I've been thinking about this too and finally stayed with
> not_global_escape_in_scope. It's supposed
> to mean an object whose escape state is not GlobalEscape is in scope.
> 
> > src/hotspot/share/compiler/compileBroker.cpp
> > src/hotspot/share/compiler/compileBroker.hpp
> > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into
> a follow up change together with the test in order to make this webrev
> smaller, but since it is included, I'm reviewing everything at once. Not a big
> deal.) Ok.
> 
> Yes the change would be a little smaller. And if it helps I'll split it off. 
> In
> general I prefer
> patches that bring along a suitable amount of tests.
> 
> > src/hotspot/share/opto/c2compiler.cpp
> > Make do_escape_analysis independent of JVMCI capabilities. Nice!
> 
> It is the main goal of the enhancement. It is done for C2, but could be done
> for JVMCI compilers
> with just a small effort as well.
> 
> > src/hotspot/share/opto/escape.cpp
> > Annotation for MachSafePointNodes. Your added functionality looks
> correct.
> > But I'd prefer to move the bulky code out of the large function.
> > I suggest to factor out something like has_not_global_escape and
> has_arg_escape. So the code could look like this:
> >   SafePointNode* sfn = sfn_worklist.at(next);
> >   sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
> >   if (sfn->is_CallJava()) {
> > CallJavaNode* call = sfn->as_CallJava();
> > 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-30 Thread Reingruber, Richard
Hi,

this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)

The change affects jvmti, hotspot and c2. Partial reviews are very welcome too.

Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/

Robbin, Martin, please let me know, if anything shouldn't be quite as you 
wanted it. Also find my
comments on your feedback below.

Robbin, can I count you as Reviewer for the runtime part?

Thanks, Richard.

--

> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Done.

> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's 
> own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting 
jvmtiDeferredLocalVariableSet is
declared.

> src/hotspot/share/code/compiledMethod.cpp
> Nice cleanup!

Thanks :)

> src/hotspot/share/code/debugInfoRec.cpp
> src/hotspot/share/code/debugInfoRec.hpp
> Additional parmeters. (Remark: I think "non_global_escape_in_scope" would 
> read better than "not_global_escape_in_scope", but your version is consistent 
> with existing code, so no change request from my side.) Ok.

I've been thinking about this too and finally stayed with 
not_global_escape_in_scope. It's supposed
to mean an object whose escape state is not GlobalEscape is in scope.

> src/hotspot/share/compiler/compileBroker.cpp
> src/hotspot/share/compiler/compileBroker.hpp
> Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a 
> follow up change together with the test in order to make this webrev smaller, 
> but since it is included, I'm reviewing everything at once. Not a big deal.) 
> Ok.

Yes the change would be a little smaller. And if it helps I'll split it off. In 
general I prefer
patches that bring along a suitable amount of tests.

> src/hotspot/share/opto/c2compiler.cpp
> Make do_escape_analysis independent of JVMCI capabilities. Nice!

It is the main goal of the enhancement. It is done for C2, but could be done 
for JVMCI compilers
with just a small effort as well.

> src/hotspot/share/opto/escape.cpp
> Annotation for MachSafePointNodes. Your added functionality looks correct.
> But I'd prefer to move the bulky code out of the large function.
> I suggest to factor out something like has_not_global_escape and 
> has_arg_escape. So the code could look like this:
>   SafePointNode* sfn = sfn_worklist.at(next);
>   sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
>   if (sfn->is_CallJava()) {
> CallJavaNode* call = sfn->as_CallJava();
> call->set_arg_escape(has_arg_escape(call));
>   }
> This would also allow us to get rid of the found_..._escape_in_args variables 
> making the loops better readable.

Done.

> It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to 
> be the way to do it (there are more such places). So it's ok.

Yeah. I copied the snippet.

> src/hotspot/share/prims/jvmtiImpl.cpp
> src/hotspot/share/prims/jvmtiImpl.hpp
> The sequence is pretty complex:
> VM_GetOrSetLocal element initialization executes EscapeBarrier code which 
> suspends the target thread (extra VM Operation).

Note that the target threads have to be suspended already for 
VM_GetOrSetLocal*. So it's mainly the
synchronization effect of EscapeBarrier::sync_and_suspend_one() that is 
required here. Also no extra
_handshake_ is executed, since sync_and_suspend_one() will find the target 
threads already
suspended.

> VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread 
> to prepare VM Operation with frame deoptimization).
> VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which 
> resumes the target thread.
> But I don't have any improvement proposal. Performance is probably not a 
> concern, here. So it's ok.

> VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has 
> non-globally escaping objects and other frames if they have arg escaping 
> ones. Good.

It's not specifically the top frame, but the frame that is accessed.

> src/hotspot/share/runtime/deoptimization.cpp
> Object deoptimization. I have more comments and proposals, here.
> First of all, handling recursive and waiting locks in relock_objects is 
> tricky, but looks correct.
> Comments are sufficient to understand why things are done as they are 
> implemented.

> BiasedLocking related parts are complex, but we may get rid of them in the 
> future (with BiasedLocking removal).
> Anyway, looks correct, too.

> Typo in comment: "regularily" => "regularly"

> Deoptimization::fetch_unroll_info_helper is the only place where 
> _jvmti_deferred_updates get deallocated (except JavaThread destructor). But I 
> think we always go through it, 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-30 Thread Reingruber, Richard
Hi,

this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)

The change affects jvmti, hotspot and c2. Partial reviews are very welcome too.

Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
Delta: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/

Robbin, Martin, please let me know, if anything shouldn't be quite as you 
wanted it. Also find my
comments on your feedback below.

Robbin, can I count you as Reviewer for the runtime part?

Thanks, Richard.

-Original Message-
From: Doerr, Martin  
Sent: Donnerstag, 12. März 2020 17:28
To: Reingruber, Richard ; 'Robbin Ehn' 
; Lindenmaier, Goetz ; David 
Holmes ; Vladimir Kozlov (vladimir.koz...@oracle.com) 
; serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,


I managed to find time for a (almost) complete review of webrev.4. (I'll review 
the tests separately.)

First of all, the change seems to be in pretty good quality for its significant 
complexity. I couldn't find any real bugs. But I'd like to propose minor 
improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be 
reused for future escape analysis based optimizations. So I appreciate having 
it available in the code base.
In addition to that, your change makes the JVMTI implementation better 
integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parmeters. (Remark: I think "non_global_escape_in_scope" would read 
better than "not_global_escape_in_scope", but your version is consistent with 
existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a 
follow up change together with the test in order to make this webrev smaller, 
but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest to factor out something like has_not_global_escape and 
has_arg_escape. So the code could look like this:
  SafePointNode* sfn = sfn_worklist.at(next);
  sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
  if (sfn->is_CallJava()) {
CallJavaNode* call = sfn->as_CallJava();
call->set_arg_escape(has_arg_escape(call));
  }
This would also allow us to get rid of the found_..._escape_in_args variables 
making the loops better readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to 
be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which 
suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to 
prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which 
resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a 
concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has 
non-globally escaping objects and other frames if they have arg 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-13 Thread Reingruber, Richard
Hi Martin,

thanks a lot for reviewing and the feedback. I'll dig into the details as soon 
as possible. Looking forward to it :)

Thanks, Richard.

-Original Message-
From: Doerr, Martin  
Sent: Donnerstag, 12. März 2020 17:28
To: Reingruber, Richard ; 'Robbin Ehn' 
; Lindenmaier, Goetz ; David 
Holmes ; Vladimir Kozlov (vladimir.koz...@oracle.com) 
; serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,


I managed to find time for a (almost) complete review of webrev.4. (I'll review 
the tests separately.)

First of all, the change seems to be in pretty good quality for its significant 
complexity. I couldn't find any real bugs. But I'd like to propose minor 
improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be 
reused for future escape analysis based optimizations. So I appreciate having 
it available in the code base.
In addition to that, your change makes the JVMTI implementation better 
integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parmeters. (Remark: I think "non_global_escape_in_scope" would read 
better than "not_global_escape_in_scope", but your version is consistent with 
existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a 
follow up change together with the test in order to make this webrev smaller, 
but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest to factor out something like has_not_global_escape and 
has_arg_escape. So the code could look like this:
  SafePointNode* sfn = sfn_worklist.at(next);
  sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
  if (sfn->is_CallJava()) {
CallJavaNode* call = sfn->as_CallJava();
call->set_arg_escape(has_arg_escape(call));
  }
This would also allow us to get rid of the found_..._escape_in_args variables 
making the loops better readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to 
be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which 
suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to 
prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which 
resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a 
concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has 
non-globally escaping objects and other frames if they have arg escaping ones. 
Good.


src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.


src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-12 Thread Doerr, Martin
Hi Richard,


I managed to find time for a (almost) complete review of webrev.4. (I'll review 
the tests separately.)

First of all, the change seems to be in pretty good quality for its significant 
complexity. I couldn't find any real bugs. But I'd like to propose minor 
improvements.
I'm convinced that it's mature because we did substantial testing.

I like the new functionality for object deoptimization. It can possibly be 
reused for future escape analysis based optimizations. So I appreciate having 
it available in the code base.
In addition to that, your change makes the JVMTI implementation better 
integrated into the VM.


Now to the details:


src/hotspot/share/c1/c1_IR.hpp
describe_scope parameters. Ok.


src/hotspot/share/ci/ciEnv.cpp
src/hotspot/share/ci/ciEnv.hpp
Fix for JvmtiExport::can_walk_any_space() capability. Ok.


src/hotspot/share/code/compiledMethod.cpp
Nice cleanup!


src/hotspot/share/code/debugInfoRec.cpp
src/hotspot/share/code/debugInfoRec.hpp
Additional parmeters. (Remark: I think "non_global_escape_in_scope" would read 
better than "not_global_escape_in_scope", but your version is consistent with 
existing code, so no change request from my side.) Ok.


src/hotspot/share/code/nmethod.cpp
Nice cleanup!


src/hotspot/share/code/pcDesc.hpp
Additional parameters. Ok.


src/hotspot/share/code/scopeDesc.cpp
src/hotspot/share/code/scopeDesc.hpp
Improved implementation + additional parameters. Ok.


src/hotspot/share/compiler/compileBroker.cpp
src/hotspot/share/compiler/compileBroker.hpp
Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a 
follow up change together with the test in order to make this webrev smaller, 
but since it is included, I'm reviewing everything at once. Not a big deal.) Ok.


src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
Additional parameters. Ok.


src/hotspot/share/opto/c2compiler.cpp
Make do_escape_analysis independent of JVMCI capabilities. Nice!


src/hotspot/share/opto/callnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/escape.cpp
Annotation for MachSafePointNodes. Your added functionality looks correct.
But I'd prefer to move the bulky code out of the large function.
I suggest to factor out something like has_not_global_escape and 
has_arg_escape. So the code could look like this:
  SafePointNode* sfn = sfn_worklist.at(next);
  sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
  if (sfn->is_CallJava()) {
CallJavaNode* call = sfn->as_CallJava();
call->set_arg_escape(has_arg_escape(call));
  }
This would also allow us to get rid of the found_..._escape_in_args variables 
making the loops better readable.

It's kind of ugly to use strcmp to recognize uncommon trap, but that seems to 
be the way to do it (there are more such places). So it's ok.


src/hotspot/share/opto/machnode.hpp
Additional fields for MachSafePointNodes. Ok.


src/hotspot/share/opto/macro.cpp
Allow elimination of non-escaping allocations. Ok.


src/hotspot/share/opto/matcher.cpp
src/hotspot/share/opto/output.cpp
Copy attribute / pass parameters. Ok.


src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
Nice cleanup!


src/hotspot/share/prims/jvmtiEnv.cpp
src/hotspot/share/prims/jvmtiEnvBase.cpp
Escape barriers + deoptimize objects for target thread. Good.


src/hotspot/share/prims/jvmtiImpl.cpp
src/hotspot/share/prims/jvmtiImpl.hpp
The sequence is pretty complex:
VM_GetOrSetLocal element initialization executes EscapeBarrier code which 
suspends the target thread (extra VM Operation).
VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM Thread to 
prepare VM Operation with frame deoptimization).
VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor which 
resumes the target thread.
But I don't have any improvement proposal. Performance is probably not a 
concern, here. So it's ok.

VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has 
non-globally escaping objects and other frames if they have arg escaping ones. 
Good.


src/hotspot/share/prims/jvmtiTagMap.cpp
Escape barriers + deoptimize objects for all threads. Ok.


src/hotspot/share/prims/whitebox.cpp
Added WB_IsFrameDeoptimized to API. Ok.


src/hotspot/share/runtime/deoptimization.cpp
Object deoptimization. I have more comments and proposals, here.
First of all, handling recursive and waiting locks in relock_objects is tricky, 
but looks correct.
Comments are sufficient to understand why things are done as they are 
implemented.

BiasedLocking related parts are complex, but we may get rid of them in the 
future (with BiasedLocking removal).
Anyway, looks correct, too.

Typo in comment: "regularily" => "regularly"

Deoptimization::fetch_unroll_info_helper is the only place where 
_jvmti_deferred_updates get deallocated (except JavaThread destructor). But I 
think we always go through it, so I can't see a memory leak or such kind of 
issues.

EscapeBarrier::deoptimize_objects: 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-03 Thread Reingruber, Richard
Hi Robbin,

> > I understand that Robbin proposed to replace the usage of
> > _suspend_flag with handshakes. Apparently, async handshakes
> > are needed to do so. We have been waiting a while for removal
> > of the _suspend_flag / introduction of async handshakes [2].
> > What is the status here?

> I have an old prototype which I would like to continue to work on.
> So do not assume asynch handshakes will make 15.
> Even if it would, I think there are a lot more investigate work to remove
> _suspend_flag.

Let us know, if we can be of any help to you and be it only testing.

> >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

> DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> You can move both declaration and definition to that file, no need to clobber
> thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Will do.

> Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's 
> own
> hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

You are right. It shouldn't be declared in thread.hpp. I will look into that.

> Note that we also think we may have a bug in deopt:
> https://bugs.openjdk.java.net/browse/JDK-8238237

> I think it would be best, if possible, to push after that is resolved.

Sure.

> Not even nearly a full review :)

I know :)

Anyways, thanks a lot,
Richard.


-Original Message-
From: Robbin Ehn  
Sent: Monday, March 2, 2020 11:17 AM
To: Lindenmaier, Goetz ; Reingruber, Richard 
; David Holmes ; Vladimir 
Kozlov (vladimir.koz...@oracle.com) ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi,

On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> Hi,
> 
> I had a look at the progress of this change. Nothing
> happened since Richard posted his update using more
> handshakes [1].
> But we (SAP) would appreciate a lot if this change could
> be successfully reviewed and pushed.
> 
> I think there is basic understanding that this
> change is helpful. It fixes a number of issues with JVMTI,
> and will deliver the same performance benefits as EA
> does in current production mode for debugging scenarios.
> 
> This is important for us as we run our VMs prepared
> for debugging in production mode.
> 
> I understand that Robbin proposed to replace the usage of
> _suspend_flag with handshakes. Apparently, async handshakes
> are needed to do so. We have been waiting a while for removal
> of the _suspend_flag / introduction of async handshakes [2].
> What is the status here?

I have an old prototype which I would like to continue to work on.
So do not assume asynch handshakes will make 15.
Even if it would, I think there are a lot more investigate work to remove
_suspend_flag.

> 
> I think we should no longer wait, but proceed with
> this change. We will look into removing the usage of
> suspend_flag introduced here once it is possible to implement
> it with handshakes.

Yes, sure.

>> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/

DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both declaration and definition to that file, no need to clobber 
thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)

Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own 
hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.

Note that we also think we may have a bug in deopt:
https://bugs.openjdk.java.net/browse/JDK-8238237

I think it would be best, if possible, to push after that is resolved.

Not even nearly a full review :)

Thanks, Robbin


>> Incremental:
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
>>
>> I was not able to eliminate the additional suspend flag now. I'll take care 
>> of this
>> as soon as the
>> existing suspend-resume-mechanism is reworked.
>>
>> Testing:
>>
>> Nightly tests @SAP:
>>
>>JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
>> Suite, SAP specific tests
>>with fastdebug and release builds on all platforms
>>
>>Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
>> for 24h
>>
>> Thanks, Richard.
>>
>>
>> More details on the changes:
>>
>> * Hide DeoptimizeObjectsALotThread from external view.
>>
>> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>>It used to be _safepoint_check_sometimes, which will be eliminated sooner 
>> or
>> later.
>>I added explicit thread state changes with ThreadBlockInVM to code paths
>> where we can wait()
>>on EscapeBarrier_lock to become safepoint safe.
>>
>> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads
>> instead of vm operation
>>VM_ThreadSuspendAllForObjDeopt.
>>
>> * Removed uses of Threads_lock. When adding a new thread we suspend 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-03-02 Thread Robbin Ehn

Hi,

On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:

Hi,

I had a look at the progress of this change. Nothing
happened since Richard posted his update using more
handshakes [1].
But we (SAP) would appreciate a lot if this change could
be successfully reviewed and pushed.

I think there is basic understanding that this
change is helpful. It fixes a number of issues with JVMTI,
and will deliver the same performance benefits as EA
does in current production mode for debugging scenarios.

This is important for us as we run our VMs prepared
for debugging in production mode.

I understand that Robbin proposed to replace the usage of
_suspend_flag with handshakes. Apparently, async handshakes
are needed to do so. We have been waiting a while for removal
of the _suspend_flag / introduction of async handshakes [2].
What is the status here?


I have an old prototype which I would like to continue to work on.
So do not assume asynch handshakes will make 15.
Even if it would, I think there are a lot more investigate work to remove
_suspend_flag.



I think we should no longer wait, but proceed with
this change. We will look into removing the usage of
suspend_flag introduced here once it is possible to implement
it with handshakes.


Yes, sure.


Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/


DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
You can move both declaration and definition to that file, no need to clobber 
thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)


Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in it's own 
hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.


Note that we also think we may have a bug in deopt:
https://bugs.openjdk.java.net/browse/JDK-8238237

I think it would be best, if possible, to push after that is resolved.

Not even nearly a full review :)

Thanks, Robbin



Incremental:
http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/

I was not able to eliminate the additional suspend flag now. I'll take care of 
this
as soon as the
existing suspend-resume-mechanism is reworked.

Testing:

Nightly tests @SAP:

   JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
Suite, SAP specific tests
   with fastdebug and release builds on all platforms

   Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
for 24h

Thanks, Richard.


More details on the changes:

* Hide DeoptimizeObjectsALotThread from external view.

* Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
   It used to be _safepoint_check_sometimes, which will be eliminated sooner or
later.
   I added explicit thread state changes with ThreadBlockInVM to code paths
where we can wait()
   on EscapeBarrier_lock to become safepoint safe.

* Use handshake EscapeBarrierSuspendHandshake to suspend target threads
instead of vm operation
   VM_ThreadSuspendAllForObjDeopt.

* Removed uses of Threads_lock. When adding a new thread we suspend it iff
EA optimizations are
   being reverted. In the previous version we were waiting on Threads_lock
while EA optimizations
   were reverted. See EscapeBarrier::thread_added().

* Made tests require Xmixed compilation mode.

* Made tests agnostic regarding tiered compilation.
   I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or
disabled.

* Exercising EATests.java as well with stress test options
DeoptimizeObjectsALot*
   Due to the non-deterministic deoptimizations some tests need to be skipped.
   We do this to prevent bit-rot of the stress test code.

* Executing EATests.java as well with graal if available. Driver for this is
   EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide 
all
the new debug info
   (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
   And graal does not yet support the JVMTI operations force early return and
pop frame.

* Removed tracing from new jdi tests in EATests.java. Too much trace output
before the debugging
   connection is established can cause deadlock because output buffers fill up.
   (See https://bugs.openjdk.java.net/browse/JDK-8173304)

* Many copyright year changes and smaller clean-up changes of testing code
(trailing white-space and
   the like).


-Original Message-
From: David Holmes 
Sent: Donnerstag, 19. Dezember 2019 03:12
To: Reingruber, Richard ; serviceability-
d...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-
runtime-...@openjdk.java.net; Vladimir Kozlov (vladimir.koz...@oracle.com)

Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
the Presence of JVMTI Agents

Hi Richard,

I think my issue is with the way EliminateNestedLocks works so I'm going
to look into that more deeply.

Thanks for the explanations.

David

On 18/12/2019 12:47 am, Reingruber, Richard wrote:

Hi David,

> >> Some further queries/concerns:
> >>
> >> 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-02-24 Thread Lindenmaier, Goetz
Hi,

I had a look at the progress of this change. Nothing
happened since Richard posted his update using more
handshakes [1].
But we (SAP) would appreciate a lot if this change could
be successfully reviewed and pushed.

I think there is basic understanding that this 
change is helpful. It fixes a number of issues with JVMTI,
and will deliver the same performance benefits as EA 
does in current production mode for debugging scenarios.

This is important for us as we run our VMs prepared
for debugging in production mode.

I understand that Robbin proposed to replace the usage of 
_suspend_flag with handshakes. Apparently, async handshakes
are needed to do so. We have been waiting a while for removal 
of the _suspend_flag / introduction of async handshakes [2].
What is the status here?

I think we should no longer wait, but proceed with 
this change. We will look into removing the usage of 
suspend_flag introduced here once it is possible to implement 
it with handshakes.

Also, I think it's a good point in time to push this, as 
jdk15 is at the beginning of development.

Best regards,
  Goetz.


[1] 
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/037984.html

[2] 
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037533.html


> -Original Message-
> From: hotspot-runtime-dev 
> On Behalf Of Reingruber, Richard
> Sent: Dienstag, 4. Februar 2020 09:59
> To: David Holmes ; Vladimir Kozlov
> (vladimir.koz...@oracle.com) ; Robbin Ehn
> ; serviceability-dev@openjdk.java.net; hotspot-
> compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
> Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance in the Presence of JVMTI Agents
> 
> Hi,
> 
> I have prepared webrev.4 that incorporates feedback from webrev.3 (thanks!)
> 
> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> Incremental:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> 
> I was not able to eliminate the additional suspend flag now. I'll take care 
> of this
> as soon as the
> existing suspend-resume-mechanism is reworked.
> 
> Testing:
> 
> Nightly tests @SAP:
> 
>   JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
> Suite, SAP specific tests
>   with fastdebug and release builds on all platforms
> 
>   Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel
> for 24h
> 
> Thanks, Richard.
> 
> 
> More details on the changes:
> 
> * Hide DeoptimizeObjectsALotThread from external view.
> 
> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
>   It used to be _safepoint_check_sometimes, which will be eliminated sooner or
> later.
>   I added explicit thread state changes with ThreadBlockInVM to code paths
> where we can wait()
>   on EscapeBarrier_lock to become safepoint safe.
> 
> * Use handshake EscapeBarrierSuspendHandshake to suspend target threads
> instead of vm operation
>   VM_ThreadSuspendAllForObjDeopt.
> 
> * Removed uses of Threads_lock. When adding a new thread we suspend it iff
> EA optimizations are
>   being reverted. In the previous version we were waiting on Threads_lock
> while EA optimizations
>   were reverted. See EscapeBarrier::thread_added().
> 
> * Made tests require Xmixed compilation mode.
> 
> * Made tests agnostic regarding tiered compilation.
>   I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or
> disabled.
> 
> * Exercising EATests.java as well with stress test options
> DeoptimizeObjectsALot*
>   Due to the non-deterministic deoptimizations some tests need to be skipped.
>   We do this to prevent bit-rot of the stress test code.
> 
> * Executing EATests.java as well with graal if available. Driver for this is
>   EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide 
> all
> the new debug info
>   (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
>   And graal does not yet support the JVMTI operations force early return and
> pop frame.
> 
> * Removed tracing from new jdi tests in EATests.java. Too much trace output
> before the debugging
>   connection is established can cause deadlock because output buffers fill up.
>   (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> 
> * Many copyright year changes and smaller clean-up changes of testing code
> (trailing white-space and
>   the like).
> 
> 
> -Original Message-
> From: David Holmes 
> Sent: Donnerstag, 19. Dezember 2019 03:12
> To: Reingruber, Richard ; serviceability-
> d...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-
> runtime-...@openjdk.java.net; Vladimir Kozlov (vladimir.koz...@oracle.com)
> 
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
> the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> I think my issue is with the way EliminateNestedLocks works so I'm going
> to look into that more deeply.
> 
> Thanks for the 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-02-21 Thread Reingruber, Richard
Ping :)

Richard.

-Original Message-
From: hotspot-compiler-dev  On 
Behalf Of Reingruber, Richard
Sent: Dienstag, 4. Februar 2020 09:59
To: David Holmes ; Vladimir Kozlov 
(vladimir.koz...@oracle.com) ; Robbin Ehn 
; serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better 
Performance in the Presence of JVMTI Agents

Hi,

I have prepared webrev.4 that incorporates feedback from webrev.3 (thanks!)

Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
Incremental: 
http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/

I was not able to eliminate the additional suspend flag now. I'll take care of 
this as soon as the
existing suspend-resume-mechanism is reworked.

Testing:

Nightly tests @SAP:

  JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance 
Suite, SAP specific tests
  with fastdebug and release builds on all platforms

  Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel 
for 24h

Thanks, Richard.


More details on the changes:

* Hide DeoptimizeObjectsALotThread from external view.

* Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
  It used to be _safepoint_check_sometimes, which will be eliminated sooner or 
later.
  I added explicit thread state changes with ThreadBlockInVM to code paths 
where we can wait()
  on EscapeBarrier_lock to become safepoint safe.

* Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead 
of vm operation
  VM_ThreadSuspendAllForObjDeopt.

* Removed uses of Threads_lock. When adding a new thread we suspend it iff EA 
optimizations are
  being reverted. In the previous version we were waiting on Threads_lock while 
EA optimizations
  were reverted. See EscapeBarrier::thread_added().

* Made tests require Xmixed compilation mode.

* Made tests agnostic regarding tiered compilation.
  I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or 
disabled.

* Exercising EATests.java as well with stress test options 
DeoptimizeObjectsALot*
  Due to the non-deterministic deoptimizations some tests need to be skipped.
  We do this to prevent bit-rot of the stress test code.

* Executing EATests.java as well with graal if available. Driver for this is
  EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide 
all the new debug info
  (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
  And graal does not yet support the JVMTI operations force early return and 
pop frame.

* Removed tracing from new jdi tests in EATests.java. Too much trace output 
before the debugging
  connection is established can cause deadlock because output buffers fill up.
  (See https://bugs.openjdk.java.net/browse/JDK-8173304)

* Many copyright year changes and smaller clean-up changes of testing code 
(trailing white-space and
  the like).


-Original Message-
From: David Holmes  
Sent: Donnerstag, 19. Dezember 2019 03:12
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net; Vladimir Kozlov 
(vladimir.koz...@oracle.com) 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

I think my issue is with the way EliminateNestedLocks works so I'm going 
to look into that more deeply.

Thanks for the explanations.

David

On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> Hi David,
> 
>> >> Some further queries/concerns:
>> >>
>> >> src/hotspot/share/runtime/objectMonitor.cpp
>> >>
>> >> Can you please explain the changes to ObjectMonitor::wait:
>> >>
>> >> !   _recursions = save  // restore the old recursion count
>> >> ! + jt->get_and_reset_relock_count_after_wait(); 
> //
>> >> increased by the deferred relock count
>> >>
>> >> what is the "deferred relock count"? I gather it relates to
>> >>
>> >> "The code was extended to be able to deoptimize objects of a
>> > frame that
>> >> is not the top frame and to let another thread than the owning
>> > thread do
>> >> it."
>> >
>> > Yes, these relate. Currently EA based optimizations are reverted, when 
> a compiled frame is
>> > replaced with corresponding interpreter frames. Part of this is 
> relocking objects with eliminated
>> > locking. New with the enhancement is that we do this also just before 
> object references are
>> > acquired through JVMTI. In this case we deoptimize also the owning 
> compiled frame C and we
>> > register deoptimized objects as deferred updates. When control returns 
> to C it gets deoptimized,
>> > we notice that objects are already deoptimized (reallocated and 
> relocked), so we don't do 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2020-02-04 Thread Reingruber, Richard
Hi,

I have prepared webrev.4 that incorporates feedback from webrev.3 (thanks!)

Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
Incremental: 
http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/

I was not able to eliminate the additional suspend flag now. I'll take care of 
this as soon as the
existing suspend-resume-mechanism is reworked.

Testing:

Nightly tests @SAP:

  JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance 
Suite, SAP specific tests
  with fastdebug and release builds on all platforms

  Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel 
for 24h

Thanks, Richard.


More details on the changes:

* Hide DeoptimizeObjectsALotThread from external view.

* Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
  It used to be _safepoint_check_sometimes, which will be eliminated sooner or 
later.
  I added explicit thread state changes with ThreadBlockInVM to code paths 
where we can wait()
  on EscapeBarrier_lock to become safepoint safe.

* Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead 
of vm operation
  VM_ThreadSuspendAllForObjDeopt.

* Removed uses of Threads_lock. When adding a new thread we suspend it iff EA 
optimizations are
  being reverted. In the previous version we were waiting on Threads_lock while 
EA optimizations
  were reverted. See EscapeBarrier::thread_added().

* Made tests require Xmixed compilation mode.

* Made tests agnostic regarding tiered compilation.
  I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or 
disabled.

* Exercising EATests.java as well with stress test options 
DeoptimizeObjectsALot*
  Due to the non-deterministic deoptimizations some tests need to be skipped.
  We do this to prevent bit-rot of the stress test code.

* Executing EATests.java as well with graal if available. Driver for this is
  EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide 
all the new debug info
  (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
  And graal does not yet support the JVMTI operations force early return and 
pop frame.

* Removed tracing from new jdi tests in EATests.java. Too much trace output 
before the debugging
  connection is established can cause deadlock because output buffers fill up.
  (See https://bugs.openjdk.java.net/browse/JDK-8173304)

* Many copyright year changes and smaller clean-up changes of testing code 
(trailing white-space and
  the like).


-Original Message-
From: David Holmes  
Sent: Donnerstag, 19. Dezember 2019 03:12
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net; Vladimir Kozlov 
(vladimir.koz...@oracle.com) 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

I think my issue is with the way EliminateNestedLocks works so I'm going 
to look into that more deeply.

Thanks for the explanations.

David

On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> Hi David,
> 
>> >> Some further queries/concerns:
>> >>
>> >> src/hotspot/share/runtime/objectMonitor.cpp
>> >>
>> >> Can you please explain the changes to ObjectMonitor::wait:
>> >>
>> >> !   _recursions = save  // restore the old recursion count
>> >> ! + jt->get_and_reset_relock_count_after_wait(); 
> //
>> >> increased by the deferred relock count
>> >>
>> >> what is the "deferred relock count"? I gather it relates to
>> >>
>> >> "The code was extended to be able to deoptimize objects of a
>> > frame that
>> >> is not the top frame and to let another thread than the owning
>> > thread do
>> >> it."
>> >
>> > Yes, these relate. Currently EA based optimizations are reverted, when 
> a compiled frame is
>> > replaced with corresponding interpreter frames. Part of this is 
> relocking objects with eliminated
>> > locking. New with the enhancement is that we do this also just before 
> object references are
>> > acquired through JVMTI. In this case we deoptimize also the owning 
> compiled frame C and we
>> > register deoptimized objects as deferred updates. When control returns 
> to C it gets deoptimized,
>> > we notice that objects are already deoptimized (reallocated and 
> relocked), so we don't do it again
>> > (relocking twice would be incorrect of course). Deferred updates are 
> copied into the new
>> > interpreter frames.
>> >
>> > Problem: relocking is not possible if the target thread T is waiting 
> on the monitor that needs to
>> > be relocked. This happens only with non-local objects with 
> EliminateNestedLocks. Instead relocking
>> > is deferred until T owns the monitor again. This is what the piece of 
> code 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-23 Thread Reingruber, Richard
Hi,

webrev.3 didn't apply anymore after 8236000 [1]. I've rebased and updated in 
place:

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

The change was minimal.

Cheers, Richard.

[1] JDK-8236000: VM build without C2 fails

-Original Message-
From: Reingruber, Richard 
Sent: Dienstag, 10. Dezember 2019 22:45
To: serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the 
Presence of JVMTI Agents

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
(thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape 
analysis even if JVMTI
agents request capabilities that allow them to access local variable values. 
E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do 
so. With the
enhancement, escape analysis will remain enabled until and after a debugger 
attaches. EA based
optimizations are reverted just before an agent acquires the reference to an 
object. In the JBS item
you'll find more details.

Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch


Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-18 Thread David Holmes

Hi Richard,

I think my issue is with the way EliminateNestedLocks works so I'm going 
to look into that more deeply.


Thanks for the explanations.

David

On 18/12/2019 12:47 am, Reingruber, Richard wrote:

Hi David,

   > >> Some further queries/concerns:
   > >>
   > >> src/hotspot/share/runtime/objectMonitor.cpp
   > >>
   > >> Can you please explain the changes to ObjectMonitor::wait:
   > >>
   > >> !   _recursions = save  // restore the old recursion count
   > >> ! + jt->get_and_reset_relock_count_after_wait(); //
   > >> increased by the deferred relock count
   > >>
   > >> what is the "deferred relock count"? I gather it relates to
   > >>
   > >> "The code was extended to be able to deoptimize objects of a
   > > frame that
   > >> is not the top frame and to let another thread than the owning
   > > thread do
   > >> it."
   > >
   > > Yes, these relate. Currently EA based optimizations are reverted, when a 
compiled frame is
   > > replaced with corresponding interpreter frames. Part of this is 
relocking objects with eliminated
   > > locking. New with the enhancement is that we do this also just before 
object references are
   > > acquired through JVMTI. In this case we deoptimize also the owning 
compiled frame C and we
   > > register deoptimized objects as deferred updates. When control returns 
to C it gets deoptimized,
   > > we notice that objects are already deoptimized (reallocated and 
relocked), so we don't do it again
   > > (relocking twice would be incorrect of course). Deferred updates are 
copied into the new
   > > interpreter frames.
   > >
   > > Problem: relocking is not possible if the target thread T is waiting on 
the monitor that needs to
   > > be relocked. This happens only with non-local objects with 
EliminateNestedLocks. Instead relocking
   > > is deferred until T owns the monitor again. This is what the piece of 
code above does.
   >
   >  Sorry I need some more detail here. How can you wait() on an object
   >  monitor if the object allocation and/or locking was optimised away? And
   >  what is a "non-local object" in this context? Isn't EA restricted to
   >  thread-confined objects?

"Non-local object" is an object that escapes its thread. The issue I'm 
addressing with the changes
in ObjectMonitor::wait are almost unrelated to EA. They are caused by 
EliminateNestedLocks, where C2
eliminates recursive locking of an already owned lock. The lock owning object 
exists on the heap, it
is locked and you can call wait() on it.

EliminateLocks is the C2 option that controls lock elimination based on EA.  
Both optimizations have
in common that objects with eliminated locking need to be relocked when 
deoptimizing a frame,
i.e. when replacing a compiled frame with equivalent interpreter
frames. Deoptimization::relock_objects does that job for /all/ eliminated locks 
in scope. /All/ can
be a mix of eliminated nested locks and locks of not-escaping objects.

New with the enhancement: I call relock_objects earlier, just before objects 
pontentially
escape. But then later when the owning compiled frame gets deoptimized, I must 
not do it again:

See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:

  373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && 
EliminateLocks))
  374   && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
  375 bool unused;
  376 eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, 
unused);
  377   }

Now when calling relock_objects early it is quiet possible that I have to 
relock an object the
target thread currently waits for. Obviously I cannot relock in this case, 
instead I chose to
introduce relock_count_after_wait to JavaThread.

   >  Is it just that some of the locking gets optimized away e.g.
   >
   >  synchronised(obj) {
   > synchronised(obj) {
   >   synchronised(obj) {
   > obj.wait();
   >   }
   > }
   >  }
   >
   >  If this is reduced to a form as-if it were a single lock of the monitor
   >  (due to EA) and the wait() triggers a JVM TI event which leads to the
   >  escape of "obj" then we need to reconstruct the true lock state, and so
   >  when the wait() internally unblocks and reacquires the monitor it has to
   >  set the true recursion count to 3, not the 1 that it appeared to be when
   >  wait() was initially called. Is that the scenario?

Kind of... except that the locking is not eliminated due to EA and there is no 
JVM TI event
triggered by wait.

Add

LocalObject l1 = new LocalObject();

in front of the synchrnized blocks and assume a JVM TI agent acquires l1. This 
triggers the code in
question.

See that relocking/reallocating is transactional. If it is done then for /all/ 
objects in scope and it is
done at most once. It wouldn't be quite so easy to split this in relocking of 
nested/EA-based
eliminated locks.

   >  If 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-17 Thread Reingruber, Richard
Hi David,

  > >> Some further queries/concerns:
  > >>
  > >> src/hotspot/share/runtime/objectMonitor.cpp
  > >>
  > >> Can you please explain the changes to ObjectMonitor::wait:
  > >>
  > >> !   _recursions = save  // restore the old recursion count
  > >> ! + jt->get_and_reset_relock_count_after_wait(); //
  > >> increased by the deferred relock count
  > >>
  > >> what is the "deferred relock count"? I gather it relates to
  > >>
  > >> "The code was extended to be able to deoptimize objects of a 
  > > frame that
  > >> is not the top frame and to let another thread than the owning 
  > > thread do
  > >> it."
  > >
  > > Yes, these relate. Currently EA based optimizations are reverted, when a 
compiled frame is
  > > replaced with corresponding interpreter frames. Part of this is relocking 
objects with eliminated
  > > locking. New with the enhancement is that we do this also just before 
object references are
  > > acquired through JVMTI. In this case we deoptimize also the owning 
compiled frame C and we
  > > register deoptimized objects as deferred updates. When control returns to 
C it gets deoptimized,
  > > we notice that objects are already deoptimized (reallocated and 
relocked), so we don't do it again
  > > (relocking twice would be incorrect of course). Deferred updates are 
copied into the new
  > > interpreter frames.
  > >
  > > Problem: relocking is not possible if the target thread T is waiting on 
the monitor that needs to
  > > be relocked. This happens only with non-local objects with 
EliminateNestedLocks. Instead relocking
  > > is deferred until T owns the monitor again. This is what the piece of 
code above does.
  >  
  >  Sorry I need some more detail here. How can you wait() on an object 
  >  monitor if the object allocation and/or locking was optimised away? And 
  >  what is a "non-local object" in this context? Isn't EA restricted to 
  >  thread-confined objects?

"Non-local object" is an object that escapes its thread. The issue I'm 
addressing with the changes
in ObjectMonitor::wait are almost unrelated to EA. They are caused by 
EliminateNestedLocks, where C2
eliminates recursive locking of an already owned lock. The lock owning object 
exists on the heap, it
is locked and you can call wait() on it.

EliminateLocks is the C2 option that controls lock elimination based on EA.  
Both optimizations have
in common that objects with eliminated locking need to be relocked when 
deoptimizing a frame,
i.e. when replacing a compiled frame with equivalent interpreter
frames. Deoptimization::relock_objects does that job for /all/ eliminated locks 
in scope. /All/ can
be a mix of eliminated nested locks and locks of not-escaping objects.

New with the enhancement: I call relock_objects earlier, just before objects 
pontentially
escape. But then later when the owning compiled frame gets deoptimized, I must 
not do it again:

See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:

 373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && 
EliminateLocks))
 374   && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
 375 bool unused;
 376 eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, 
unused);
 377   }

Now when calling relock_objects early it is quiet possible that I have to 
relock an object the
target thread currently waits for. Obviously I cannot relock in this case, 
instead I chose to
introduce relock_count_after_wait to JavaThread.

  >  Is it just that some of the locking gets optimized away e.g.
  >  
  >  synchronised(obj) {
  > synchronised(obj) {
  >   synchronised(obj) {
  > obj.wait();
  >   }
  > }
  >  }
  >  
  >  If this is reduced to a form as-if it were a single lock of the monitor 
  >  (due to EA) and the wait() triggers a JVM TI event which leads to the 
  >  escape of "obj" then we need to reconstruct the true lock state, and so 
  >  when the wait() internally unblocks and reacquires the monitor it has to 
  >  set the true recursion count to 3, not the 1 that it appeared to be when 
  >  wait() was initially called. Is that the scenario?

Kind of... except that the locking is not eliminated due to EA and there is no 
JVM TI event
triggered by wait.

Add

LocalObject l1 = new LocalObject();

in front of the synchrnized blocks and assume a JVM TI agent acquires l1. This 
triggers the code in
question.

See that relocking/reallocating is transactional. If it is done then for /all/ 
objects in scope and it is
done at most once. It wouldn't be quite so easy to split this in relocking of 
nested/EA-based
eliminated locks.

  >  If so I find this truly awful. Anyone using wait() in a realistic form 
  >  requires a notification and so the object cannot be thread confined. In

It is not thread confined.

  >  which case I would strongly argue that upon hitting the wait() the 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-17 Thread Robbin Ehn

Hi Richard,

On 12/17/19 11:24 AM, Reingruber, Richard wrote:


   > So adding asynch handshakes and a per thread handshake queue, we can.
   > (which this test prototype does)

Yes, should work for my use case too.


Great.



   > The issue I'm thinking of is if we need selective polling first.
   > Suspend flags are not checked in every transition, e.g. vm->native.
   > A JVM TI agent don't expect to suspend it's own thread when suspending
   > all threads.
   > (that thread would be suspended when trying to get back to agent code
   > when it does vm->native transition)

Note that JVM TI doesn't offer "suspending all threads" directly. It offers 
SuspendThreadList [1]
which can be used to self-suspend: "If the calling thread is specified in the 
request_list array,
this function will not return until some other thread resumes it"


Maybe there is a test-bug here or it was more complicated scenario.
I have to investigate, but suspending threads in all transitions causes a chunk 
of test failure in jdi/jvmti.

The issue was suspending threads going vm->native (back to agent code).

Thanks, Robbin



Thanks, Richard.

[1] 
https://docs.oracle.com/en/java/javase/13/docs/specs/jvmti.html#SuspendThreadList

-Original Message-
From: Robbin Ehn 
Sent: Montag, 16. Dezember 2019 18:21
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 2019-12-16 14:41, Reingruber, Richard wrote:

Hi Robbin,

first of all: thanks a lot for providing feedback. I do appreciate it.

I am absolutely willing to move this to handshakes. Only I still can't see how 
to achieve it.

Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? 
[1]

I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() 
where calling_thread
T1 would apply it on another thread T2.


Sorry I don't immediately see what issue there is in doing a handshake
instead of:
VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread,
owned_monitors_list);



1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot 
move anymore?

2. Handshakes between two threads are synchronous, correct? If so, then T1 will 
block handshaking
 T2, because either T2 or the VMThread will block in L10.


Yes, sorry, I forgot/confused myself about asynch handshake.
(I have a test prototype for that, which removes suspend flag)



I cannot figure out, how you mean this. Only if a helper thread H would 
handshake T2 then T1 could
continue and call wait_until_eb_stopped(). But returning from there T1 would 
block if reallocating
objects triggers GC or attempting to execute the vm operation in
JvmtiEnv::GetOwnedMonitorStackDepthInfo().

It might be impossible to replace my suspend flag with handshakes that are 
available today, because
if it was you could replace all the suspend flags right away, couldn't you?


So adding asynch handshakes and a per thread handshake queue, we can.
(which this test prototype does)
The issue I'm thinking of is if we need selective polling first.
Suspend flags are not checked in every transition, e.g. vm->native.
A JVM TI agent don't expect to suspend it's own thread when suspending
all threads.
(that thread would be suspended when trying to get back to agent code
when it does vm->native transition)



Or I'm simply missing something... quite possible... :)


No I think you got it right.

Thanks, Robbin



Thanks, Richard.

[1] Drafted by Robbin (thanks!)

   1class EscapeBarrierSuspendHandshake : public HandshakeClosure {
   2  Semaphore _is_waiting;
   3  Semaphore _wait;
   4  bool _started;
   5public:
   6  EscapeBarrierSuspendHandshake() : 
HandshakeClosure("EscapeBarrierSuspend"),
   7  _wait(0), _started(false) { }
   8  void do_thread(Thread* th) {
   9_is_waiting.signal();
  10_wait.wait();
  11Atomic::store(&_started, true);
  12  }
  13  void wait_until_eb_stopped() { _is_waiting.wait(); }
  14  void start_thread() {
  15_wait.signal();
  16while(!Atomic::load(&_started)) {
  17  os::naked_yield();
  18}
  19  }
  20};

-Original Message-
From: Robbin Ehn 
Sent: Montag, 16. Dezember 2019 11:21
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard, as mentioned it would be better if you could do this with
handshakes, instead of using _suspend_flag (since they are going away).
But I can't 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-17 Thread Reingruber, Richard
Hi Robbin,

  > Sorry I don't immediately see what issue there is in doing a handshake 
  > instead of:
  > VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, 
  > owned_monitors_list);

VM_GetOwnedMonitorInfo /can/ be replaced by a handshake, but the calling_thread 
T1 needs to walk
java_thread T2's stack /before/ to reallocate and relock objects, because the 
GC interface does not
allow the VMThread to allocate from the java heap.

T1:
1. reallocate scalar replaced objects of T2   //  not possible as part of 
handshake/vmop,
  //  because GC interface does not 
allow VMThread
  //  to allocate from heap
2. execute VM_GetOwnedMonitorInfo() or equivalent handshake

while T2 is /not/ pushing new frames with EA based optimizations.

  > > 
  > > 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 
cannot move anymore?
  > > 
  > > 2. Handshakes between two threads are synchronous, correct? If so, then 
T1 will block handshaking
  > > T2, because either T2 or the VMThread will block in L10.
  > 
  > Yes, sorry, I forgot/confused myself about asynch handshake.
  > (I have a test prototype for that, which removes suspend flag)
  > 
  > > 
  > > I cannot figure out, how you mean this. Only if a helper thread H would 
handshake T2 then T1 could
  > > continue and call wait_until_eb_stopped(). But returning from there T1 
would block if reallocating
  > > objects triggers GC or attempting to execute the vm operation in
  > > JvmtiEnv::GetOwnedMonitorStackDepthInfo().
  > > 
  > > It might be impossible to replace my suspend flag with handshakes that 
are available today, because
  > > if it was you could replace all the suspend flags right away, couldn't 
you?
  > 
  > So adding asynch handshakes and a per thread handshake queue, we can. 
  > (which this test prototype does)

Yes, should work for my use case too.

  > The issue I'm thinking of is if we need selective polling first.
  > Suspend flags are not checked in every transition, e.g. vm->native.
  > A JVM TI agent don't expect to suspend it's own thread when suspending
  > all threads.
  > (that thread would be suspended when trying to get back to agent code
  > when it does vm->native transition)

Note that JVM TI doesn't offer "suspending all threads" directly. It offers 
SuspendThreadList [1]
which can be used to self-suspend: "If the calling thread is specified in the 
request_list array,
this function will not return until some other thread resumes it"

Thanks, Richard.

[1] 
https://docs.oracle.com/en/java/javase/13/docs/specs/jvmti.html#SuspendThreadList

-Original Message-
From: Robbin Ehn  
Sent: Montag, 16. Dezember 2019 18:21
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 2019-12-16 14:41, Reingruber, Richard wrote:
> Hi Robbin,
> 
> first of all: thanks a lot for providing feedback. I do appreciate it.
> 
> I am absolutely willing to move this to handshakes. Only I still can't see 
> how to achieve it.
> 
> Could you explain the drafted class EscapeBarrierSuspendHandshake a little 
> bit? [1]
> 
> I'd like to look at it by example of 
> JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread
> T1 would apply it on another thread T2.

Sorry I don't immediately see what issue there is in doing a handshake 
instead of:
VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, 
owned_monitors_list);

> 
> 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot 
> move anymore?
> 
> 2. Handshakes between two threads are synchronous, correct? If so, then T1 
> will block handshaking
> T2, because either T2 or the VMThread will block in L10.

Yes, sorry, I forgot/confused myself about asynch handshake.
(I have a test prototype for that, which removes suspend flag)

> 
> I cannot figure out, how you mean this. Only if a helper thread H would 
> handshake T2 then T1 could
> continue and call wait_until_eb_stopped(). But returning from there T1 would 
> block if reallocating
> objects triggers GC or attempting to execute the vm operation in
> JvmtiEnv::GetOwnedMonitorStackDepthInfo().
> 
> It might be impossible to replace my suspend flag with handshakes that are 
> available today, because
> if it was you could replace all the suspend flags right away, couldn't you?

So adding asynch handshakes and a per thread handshake queue, we can. 
(which this test prototype does)
The issue I'm thinking of is if we need selective polling first.
Suspend flags are not checked in every transition, e.g. vm->native.
A JVM TI agent don't expect to suspend it's own thread when suspending
all threads.
(that thread would be suspended when trying to get back to agent code
when it does vm->native 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-16 Thread David Holmes



David

On 17/12/2019 4:57 pm, David Holmes wrote:

Hi Richard,

On 14/12/2019 5:01 am, Reingruber, Richard wrote:

Hi David,

   > Some further queries/concerns:
   >
   > src/hotspot/share/runtime/objectMonitor.cpp
   >
   > Can you please explain the changes to ObjectMonitor::wait:
   >
   > !   _recursions = save  // restore the old recursion count
   > ! + jt->get_and_reset_relock_count_after_wait(); //
   > increased by the deferred relock count
   >
   > what is the "deferred relock count"? I gather it relates to
   >
   > "The code was extended to be able to deoptimize objects of a 
frame that
   > is not the top frame and to let another thread than the owning 
thread do

   > it."

Yes, these relate. Currently EA based optimizations are reverted, when 
a compiled frame is replaced
with corresponding interpreter frames. Part of this is relocking 
objects with eliminated
locking. New with the enhancement is that we do this also just before 
object references are acquired
through JVMTI. In this case we deoptimize also the owning compiled 
frame C and we register
deoptimized objects as deferred updates. When control returns to C it 
gets deoptimized, we notice
that objects are already deoptimized (reallocated and relocked), so we 
don't do it again (relocking
twice would be incorrect of course). Deferred updates are copied into 
the new interpreter frames.


Problem: relocking is not possible if the target thread T is waiting 
on the monitor that needs to be
relocked. This happens only with non-local objects with 
EliminateNestedLocks. Instead relocking is
deferred until T owns the monitor again. This is what the piece of 
code above does.


Sorry I need some more detail here. How can you wait() on an object 
monitor if the object allocation and/or locking was optimised away? And 
what is a "non-local object" in this context? Isn't EA restricted to 
thread-confined objects?


Is it just that some of the locking gets optimized away e.g.

synchronised(obj) {
   synchronised(obj) {
     synchronised(obj) {
   obj.wait();
     }
   }
}

If this is reduced to a form as-if it were a single lock of the monitor 
(due to EA) and the wait() triggers a JVM TI event which leads to the 
escape of "obj" then we need to reconstruct the true lock state, and so 
when the wait() internally unblocks and reacquires the monitor it has to 
set the true recursion count to 3, not the 1 that it appeared to be when 
wait() was initially called. Is that the scenario?


If so I find this truly awful. Anyone using wait() in a realistic form 
requires a notification and so the object cannot be thread confined. In 
which case I would strongly argue that upon hitting the wait() the deopt 
should occur unconditionally and so the lock state is correct before we 
wait and so we don't need to mess with the recursion count internally 
when we reacquire the monitor.




   > which I don't like the sound of at all when it comes to 
ObjectMonitor
   > state. So I'd like to understand in detail exactly what is going 
on here

   > and why.  This is a very intrusive change that seems to badly break
   > encapsulation and impacts future changes to ObjectMonitor that 
are under

   > investigation.

I would not regard this as breaking encapsulation. Certainly not badly.

I've added a property relock_count_after_wait to JavaThread. The 
property is well
encapsulated. Future ObjectMonitor implementations have to deal with 
recursion too. They are free in
choosing a way to do that as long as that property is taken into 
account. This is hardly a

limitation.


I do think this badly breaks encapsulation as you have to add a callout 
from the guts of the ObjectMonitor code to reach into the thread to get 
this lock count adjustment. I understand why you have had to do this but 
I would much rather see a change to the EA optimisation strategy so that 
this is not needed.


Note also that the property is a straight forward extension of the 
existing concept of deferred
local updates. It is embedded into the structure holding them. So not 
even the footprint of a

JavaThread is enlarged if no deferred updates are generated.

   > ---
   >
   > src/hotspot/share/runtime/thread.cpp
   >
   > Can you please explain why 
JavaThread::wait_for_object_deoptimization
   > has to be handcrafted in this way rather than using proper 
transitions.

   >

I wrote wait_for_object_deoptimization taking 
JavaThread::java_suspend_self_with_safepoint_check

as template. So in short: for the same reasons :)

Threads reach both methods as part of thread state transitions, 
therefore special handling is

required to change thread state on top of ongoing transitions.

   > We got rid of "deopt suspend" some time ago and it is disturbing 
to see
   > it being added back (effectively). This seems like it may be 
something

   > that handshakes could be used for.

Deopt suspend used to be something rather different with a similar 
name[1]. It is not 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-16 Thread Robbin Ehn

Hi Richard,

On 2019-12-16 14:41, Reingruber, Richard wrote:

Hi Robbin,

first of all: thanks a lot for providing feedback. I do appreciate it.

I am absolutely willing to move this to handshakes. Only I still can't see how 
to achieve it.

Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? 
[1]

I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() 
where calling_thread
T1 would apply it on another thread T2.


Sorry I don't immediately see what issue there is in doing a handshake 
instead of:
VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, 
owned_monitors_list);




1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot 
move anymore?

2. Handshakes between two threads are synchronous, correct? If so, then T1 will 
block handshaking
T2, because either T2 or the VMThread will block in L10.


Yes, sorry, I forgot/confused myself about asynch handshake.
(I have a test prototype for that, which removes suspend flag)



I cannot figure out, how you mean this. Only if a helper thread H would 
handshake T2 then T1 could
continue and call wait_until_eb_stopped(). But returning from there T1 would 
block if reallocating
objects triggers GC or attempting to execute the vm operation in
JvmtiEnv::GetOwnedMonitorStackDepthInfo().

It might be impossible to replace my suspend flag with handshakes that are 
available today, because
if it was you could replace all the suspend flags right away, couldn't you?


So adding asynch handshakes and a per thread handshake queue, we can. 
(which this test prototype does)

The issue I'm thinking of is if we need selective polling first.
Suspend flags are not checked in every transition, e.g. vm->native.
A JVM TI agent don't expect to suspend it's own thread when suspending
all threads.
(that thread would be suspended when trying to get back to agent code
when it does vm->native transition)



Or I'm simply missing something... quite possible... :)


No I think you got it right.

Thanks, Robbin



Thanks, Richard.

[1] Drafted by Robbin (thanks!)

  1 class EscapeBarrierSuspendHandshake : public HandshakeClosure {
  2   Semaphore _is_waiting;
  3   Semaphore _wait;
  4   bool _started;
  5 public:
  6   EscapeBarrierSuspendHandshake() : 
HandshakeClosure("EscapeBarrierSuspend"),
  7   _wait(0), _started(false) { }
  8   void do_thread(Thread* th) {
  9 _is_waiting.signal();
 10 _wait.wait();
 11 Atomic::store(&_started, true);
 12   }
 13   void wait_until_eb_stopped() { _is_waiting.wait(); }
 14   void start_thread() {
 15 _wait.signal();
 16 while(!Atomic::load(&_started)) {
 17   os::naked_yield();
 18 }
 19   }
 20 };

-Original Message-
From: Robbin Ehn 
Sent: Montag, 16. Dezember 2019 11:21
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard, as mentioned it would be better if you could do this with
handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way doing it without blocking safepoints, so we need to
add some more features in handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:
class EscapeBarrierSuspendHandshake : public HandshakeClosure {
Semaphore _is_waiting;
Semaphore _wait;
bool _started;
   public:
EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
_wait(0), _started(false) { }
void do_thread(Thread* th) {
  _is_waiting.signal();
  _wait.wait();
  Atomic::store(&_started, true);
}
void wait_until_eb_stopped() { _is_waiting.wait(); }
void start_thread() {
  _wait.signal();
  while(!Atomic::load(&_started)) {
os::naked_yield();
  }
}
};

But it would block safepoints.

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
(thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape 
analysis even if JVMTI
agents request capabilities that allow them to access local variable values. 
E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
analysis is disabled right
from the beginning, well before 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-16 Thread Reingruber, Richard
Hi Robbin,

first of all: thanks a lot for providing feedback. I do appreciate it.

I am absolutely willing to move this to handshakes. Only I still can't see how 
to achieve it.

Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? 
[1]

I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() 
where calling_thread
T1 would apply it on another thread T2.

1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot 
move anymore?

2. Handshakes between two threads are synchronous, correct? If so, then T1 will 
block handshaking
   T2, because either T2 or the VMThread will block in L10.

I cannot figure out, how you mean this. Only if a helper thread H would 
handshake T2 then T1 could
continue and call wait_until_eb_stopped(). But returning from there T1 would 
block if reallocating
objects triggers GC or attempting to execute the vm operation in
JvmtiEnv::GetOwnedMonitorStackDepthInfo().

It might be impossible to replace my suspend flag with handshakes that are 
available today, because
if it was you could replace all the suspend flags right away, couldn't you?

Or I'm simply missing something... quite possible... :)

Thanks, Richard.

[1] Drafted by Robbin (thanks!)

 1  class EscapeBarrierSuspendHandshake : public HandshakeClosure {
 2Semaphore _is_waiting;
 3Semaphore _wait;
 4bool _started;
 5  public:
 6EscapeBarrierSuspendHandshake() : 
HandshakeClosure("EscapeBarrierSuspend"),
 7_wait(0), _started(false) { }
 8void do_thread(Thread* th) {
 9  _is_waiting.signal();
10  _wait.wait();
11  Atomic::store(&_started, true);
12}
13void wait_until_eb_stopped() { _is_waiting.wait(); }
14void start_thread() {
15  _wait.signal();
16  while(!Atomic::load(&_started)) {
17os::naked_yield();
18  }
19}
20  };

-Original Message-
From: Robbin Ehn  
Sent: Montag, 16. Dezember 2019 11:21
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard, as mentioned it would be better if you could do this with
handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way doing it without blocking safepoints, so we need to
add some more features in handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:
class EscapeBarrierSuspendHandshake : public HandshakeClosure {
   Semaphore _is_waiting;
   Semaphore _wait;
   bool _started;
  public:
   EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"), 
_wait(0), _started(false) { }
   void do_thread(Thread* th) {
 _is_waiting.signal();
 _wait.wait();
 Atomic::store(&_started, true);
   }
   void wait_until_eb_stopped() { _is_waiting.wait(); }
   void start_thread() {
 _wait.signal();
 while(!Atomic::load(&_started)) {
   os::naked_yield();
 }
   }
};

But it would block safepoints.

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:
> Hi,
> 
> I would like to get reviews please for
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> 
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
> 
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> 
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
> (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
> 
> The intention of this enhancement is to benefit performance wise from escape 
> analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. 
> E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
> analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do 
> so. With the
> enhancement, escape analysis will remain enabled until and after a debugger 
> attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an 
> object. In the JBS item
> you'll find more details.
> 
> Thanks,
> Richard.
> 
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>  
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> 


Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-16 Thread Robbin Ehn

Hi Richard, as mentioned it would be better if you could do this with
handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way doing it without blocking safepoints, so we need to
add some more features in handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:
class EscapeBarrierSuspendHandshake : public HandshakeClosure {
  Semaphore _is_waiting;
  Semaphore _wait;
  bool _started;
 public:
  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"), 
_wait(0), _started(false) { }

  void do_thread(Thread* th) {
_is_waiting.signal();
_wait.wait();
Atomic::store(&_started, true);
  }
  void wait_until_eb_stopped() { _is_waiting.wait(); }
  void start_thread() {
_wait.signal();
while(!Atomic::load(&_started)) {
  os::naked_yield();
}
  }
};

But it would block safepoints.

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
(thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape 
analysis even if JVMTI
agents request capabilities that allow them to access local variable values. 
E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do 
so. With the
enhancement, escape analysis will remain enabled until and after a debugger 
attaches. EA based
optimizations are reverted just before an agent acquires the reference to an 
object. In the JBS item
you'll find more details.

Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
 http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch



RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-13 Thread Reingruber, Richard
Hi David,

  > Some further queries/concerns:
  > 
  > src/hotspot/share/runtime/objectMonitor.cpp
  > 
  > Can you please explain the changes to ObjectMonitor::wait:
  > 
  > !   _recursions = save  // restore the old recursion count
  > ! + jt->get_and_reset_relock_count_after_wait(); //
  > increased by the deferred relock count
  > 
  > what is the "deferred relock count"? I gather it relates to
  > 
  > "The code was extended to be able to deoptimize objects of a frame that
  > is not the top frame and to let another thread than the owning thread do
  > it."

Yes, these relate. Currently EA based optimizations are reverted, when a 
compiled frame is replaced
with corresponding interpreter frames. Part of this is relocking objects with 
eliminated
locking. New with the enhancement is that we do this also just before object 
references are acquired
through JVMTI. In this case we deoptimize also the owning compiled frame C and 
we register
deoptimized objects as deferred updates. When control returns to C it gets 
deoptimized, we notice
that objects are already deoptimized (reallocated and relocked), so we don't do 
it again (relocking
twice would be incorrect of course). Deferred updates are copied into the new 
interpreter frames.

Problem: relocking is not possible if the target thread T is waiting on the 
monitor that needs to be
relocked. This happens only with non-local objects with EliminateNestedLocks. 
Instead relocking is
deferred until T owns the monitor again. This is what the piece of code above 
does.

  > which I don't like the sound of at all when it comes to ObjectMonitor
  > state. So I'd like to understand in detail exactly what is going on here
  > and why.  This is a very intrusive change that seems to badly break
  > encapsulation and impacts future changes to ObjectMonitor that are under
  > investigation.

I would not regard this as breaking encapsulation. Certainly not badly.

I've added a property relock_count_after_wait to JavaThread. The property is 
well
encapsulated. Future ObjectMonitor implementations have to deal with recursion 
too. They are free in
choosing a way to do that as long as that property is taken into account. This 
is hardly a
limitation.

Note also that the property is a straight forward extension of the existing 
concept of deferred
local updates. It is embedded into the structure holding them. So not even the 
footprint of a
JavaThread is enlarged if no deferred updates are generated.

  > ---
  > 
  > src/hotspot/share/runtime/thread.cpp
  > 
  > Can you please explain why JavaThread::wait_for_object_deoptimization
  > has to be handcrafted in this way rather than using proper transitions.
  > 

I wrote wait_for_object_deoptimization taking 
JavaThread::java_suspend_self_with_safepoint_check
as template. So in short: for the same reasons :)

Threads reach both methods as part of thread state transitions, therefore 
special handling is
required to change thread state on top of ongoing transitions.

  > We got rid of "deopt suspend" some time ago and it is disturbing to see
  > it being added back (effectively). This seems like it may be something
  > that handshakes could be used for.

Deopt suspend used to be something rather different with a similar name[1]. It 
is not being added back.

I'm actually duplicating the existing external suspend mechanism, because a 
thread can be suspended
at most once. And hey, and don't like that either! But it seems not unlikely 
that the duplicate can
be removed together with the original and the new type of handshakes that will 
be used for
thread suspend can be used for object deoptimization too. See today's 
discussion in JDK-8227745 [2].

Thanks, Richard.

[1] Deopt suspend was something like an async. handshake for architectures with 
register windows,
where patching the return pc for deoptimization of a compiled frame was 
racy if the owner thread
was in native code. Instead a "deopt" suspend flag was set on which the 
thread patched its own
frame upon return from native. So no thread was suspended. It got its name 
only from the name of
the flags.

[2] Discussion about using handshakes to sync. with the target thread:

https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727

-Original Message-
From: David Holmes  
Sent: Freitag, 13. Dezember 2019 00:56
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

Some further queries/concerns:

src/hotspot/share/runtime/objectMonitor.cpp

Can you please explain the changes to ObjectMonitor::wait:

!   _recursions = save  // restore the old recursion count
! + 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-13 Thread Reingruber, Richard
Hi David, Vladimir,

The tests are very targeted and customized towards the issues they solve. IMHO 
they should be run in
the configuration they are tailored for, but as I said, I'm ok with removing 
the tiered
options/conditions.

The enhancement should be covered also by existing JVMTI, JDI, JDWP tests, 
assuming they are also
executed with Xcomp.

If running the tests with Graal as C2 replacement you'll get failures, because 
the JVMCI compiler
does not provide the debug info required at runtime (see 
compiledVFrame::not_global_escape_in_scope()
and compiledVFrame::arg_escape). Still it would be possible to change the tests 
to expect these
failures when executed with Graal. Perhaps I should do this?

Thanks, Richard.

-Original Message-
From: David Holmes  
Sent: Freitag, 13. Dezember 2019 02:53
To: Vladimir Kozlov ; Reingruber, Richard 
; hotspot-runtime-...@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; serviceability-dev@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

On 13/12/2019 10:56 am, Vladimir Kozlov wrote:
> Yes, David
> 
> You are correct these changes touch all part of VM and may affect Graal 
> (which also has EA) too.
> Changes should be tested in all our modes: tiered, C1 only, Graal, 
> Interpreter. And I realized that I only ran tier3-graal testing so I 
> submitted the rest of Graal's tiers now.
> 
> I had assumed that our current testing (I ran all from tier1 to tier8) 
> should exercise all paths in VM these changes touch. But I may be wrong 
> and it is correct to ask author to add testing in all VM modes to make 
> sure new code in VM's runtime and JVMTI is tested.

It may be that our existing JVM TI tests will exercise this adequately 
and that the new tests are more "whitebox" testing than general 
functional tests. But it is not obvious to me that we do have the 
coverage we need.

Cheers,
David

> I do like to keep what current test is doing with C2. May be add an 
> other test for other modes or modify current one to enable to run it in 
> other modes.
> 
> Thanks,
> Vladimir
> 
> On 12/12/19 3:32 PM, David Holmes wrote:
>> On 13/12/2019 9:02 am, Reingruber, Richard wrote:
>>> Hello Vladimir,
>>>
>>> thanks for having a look.
>>>
>>>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to 
>>> skip
>>>    > test from running in Interpreter mode too.
>>>
>>> Done.
>>>
>>>    > You don't need vm.opt.TieredCompilation != true in @requires 
>>> because you
>>>    > specified -XX:-TieredCompilation in @run command.
>>>
>>> Ok.
>>>
>>>    > The test is specifically written for C2 only (not for C1 or 
>>> Graal) to
>>>    > verify its Escape Analysis optimization.
>>>    > I did not look in great details into test's code but its 
>>> analysis may be
>>>    > affected if C1 compiler is also used.
>>>    >
>>>    > Richard may clarify this.
>>>
>>> The test cases aim to get their testmethod 'dontinline_testMethod' 
>>> compiled by C2. If they get C1
>>> compiled before doesn't matter all that much. I've got a slight 
>>> preference to disabled tiered
>>> compilation for simplicity.
>>
>> My concern - perhaps unfounded - is that this seems to be being tested 
>> only in a pure C2 environment when the actual changes will have to 
>> operate correctly in a tiered environment (and JVMCI).
>>
>> Thanks,
>> David
>>
>>> Thanks, Richard.
>>>
>>> -Original Message-
>>> From: Vladimir Kozlov 
>>> Sent: Donnerstag, 12. Dezember 2019 19:20
>>> To: David Holmes ; 
>>> hotspot-runtime-...@openjdk.java.net; 
>>> hotspot-compiler-...@openjdk.java.net; 
>>> serviceability-dev@openjdk.java.net; Reingruber, Richard 
>>> 
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi David,
>>>
>>> Tiered is disabled because we don't want to see compilations and outputs
>>> from C1 compiler which does not have EA.
>>>
>>> The test is specifically written for C2 only (not for C1 or Graal) to
>>> verify its Escape Analysis optimization.
>>> I did not look in great details into test's code but its analysis may be
>>> affected if C1 compiler is also used.
>>>
>>> Richard may clarify this.
>>>
>>> thanks,
>>> Vladimir
>>>
>>> On 12/11/19 1:04 PM, David Holmes wrote:
 On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
> I will do full review later. I want to comment about test command 
> line.
>
> You don't need vm.opt.TieredCompilation != true in @requires because
> you specified -XX:-TieredCompilation in @run command.

 And per my comment this should be being tested with tiered as well.

 David

> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
> test from running in Interpreter mode too.
>
> Thanks,
> Vladimir
>
> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>> Hi David,
>>
>>     > Most of the details here are in areas I 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-12 Thread David Holmes

On 13/12/2019 10:56 am, Vladimir Kozlov wrote:

Yes, David

You are correct these changes touch all part of VM and may affect Graal 
(which also has EA) too.
Changes should be tested in all our modes: tiered, C1 only, Graal, 
Interpreter. And I realized that I only ran tier3-graal testing so I 
submitted the rest of Graal's tiers now.


I had assumed that our current testing (I ran all from tier1 to tier8) 
should exercise all paths in VM these changes touch. But I may be wrong 
and it is correct to ask author to add testing in all VM modes to make 
sure new code in VM's runtime and JVMTI is tested.


It may be that our existing JVM TI tests will exercise this adequately 
and that the new tests are more "whitebox" testing than general 
functional tests. But it is not obvious to me that we do have the 
coverage we need.


Cheers,
David

I do like to keep what current test is doing with C2. May be add an 
other test for other modes or modify current one to enable to run it in 
other modes.


Thanks,
Vladimir

On 12/12/19 3:32 PM, David Holmes wrote:

On 13/12/2019 9:02 am, Reingruber, Richard wrote:

Hello Vladimir,

thanks for having a look.

   > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to 
skip

   > test from running in Interpreter mode too.

Done.

   > You don't need vm.opt.TieredCompilation != true in @requires 
because you

   > specified -XX:-TieredCompilation in @run command.

Ok.

   > The test is specifically written for C2 only (not for C1 or 
Graal) to

   > verify its Escape Analysis optimization.
   > I did not look in great details into test's code but its 
analysis may be

   > affected if C1 compiler is also used.
   >
   > Richard may clarify this.

The test cases aim to get their testmethod 'dontinline_testMethod' 
compiled by C2. If they get C1
compiled before doesn't matter all that much. I've got a slight 
preference to disabled tiered

compilation for simplicity.


My concern - perhaps unfounded - is that this seems to be being tested 
only in a pure C2 environment when the actual changes will have to 
operate correctly in a tiered environment (and JVMCI).


Thanks,
David


Thanks, Richard.

-Original Message-
From: Vladimir Kozlov 
Sent: Donnerstag, 12. Dezember 2019 19:20
To: David Holmes ; 
hotspot-runtime-...@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; 
serviceability-dev@openjdk.java.net; Reingruber, Richard 

Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
Performance in the Presence of JVMTI Agents


Hi David,

Tiered is disabled because we don't want to see compilations and outputs
from C1 compiler which does not have EA.

The test is specifically written for C2 only (not for C1 or Graal) to
verify its Escape Analysis optimization.
I did not look in great details into test's code but its analysis may be
affected if C1 compiler is also used.

Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:

On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
I will do full review later. I want to comment about test command 
line.


You don't need vm.opt.TieredCompilation != true in @requires because
you specified -XX:-TieredCompilation in @run command.


And per my comment this should be being tested with tiered as well.

David


Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
test from running in Interpreter mode too.

Thanks,
Vladimir

On 12/11/19 7:07 AM, Reingruber, Richard wrote:

Hi David,

    > Most of the details here are in areas I can comment on in
detail, but I
    > did take an initial general look at things.

Thanks for taking the time!

    > The only thing that jumped out at me is that I think the
    > DeoptimizeObjectsALotThread should be a hidden thread.
    >
    > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

    > Also I don't see any testing of the 
DeoptimizeObjectsALotThread.

Without
    > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger
workload. I will add a minimal test
to keep it fresh.

    > Also on the tests I don't understand your @requires clause:
    >
    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
    > (vm.opt.TieredCompilation != true))
    >
    > This seems to require that TieredCompilation is disabled, but
tiered is
    > our normal mode of operation. ??
    >

I removed the clause. I guess I wanted to target the tests towards
the code they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and
with just one compiler thread.

Additionally I will make use of
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks,
Richard.

-Original Message-
From: David Holmes 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ;
serviceability-dev@openjdk.java.net;
hotspot-compiler-...@openjdk.java.net;
hotspot-runtime-...@openjdk.java.net
Subject: 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-12 Thread Vladimir Kozlov

Yes, David

You are correct these changes touch all part of VM and may affect Graal (which 
also has EA) too.
Changes should be tested in all our modes: tiered, C1 only, Graal, Interpreter. And I realized that 
I only ran tier3-graal testing so I submitted the rest of Graal's tiers now.


I had assumed that our current testing (I ran all from tier1 to tier8) should exercise all paths in 
VM these changes touch. But I may be wrong and it is correct to ask author to add testing in all VM 
modes to make sure new code in VM's runtime and JVMTI is tested.


I do like to keep what current test is doing with C2. May be add an other test for other modes or 
modify current one to enable to run it in other modes.


Thanks,
Vladimir

On 12/12/19 3:32 PM, David Holmes wrote:

On 13/12/2019 9:02 am, Reingruber, Richard wrote:

Hello Vladimir,

thanks for having a look.

   > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
   > test from running in Interpreter mode too.

Done.

   > You don't need vm.opt.TieredCompilation != true in @requires because you
   > specified -XX:-TieredCompilation in @run command.

Ok.

   > The test is specifically written for C2 only (not for C1 or Graal) to
   > verify its Escape Analysis optimization.
   > I did not look in great details into test's code but its analysis may be
   > affected if C1 compiler is also used.
   >
   > Richard may clarify this.

The test cases aim to get their testmethod 'dontinline_testMethod' compiled by 
C2. If they get C1
compiled before doesn't matter all that much. I've got a slight preference to 
disabled tiered
compilation for simplicity.


My concern - perhaps unfounded - is that this seems to be being tested only in a pure C2 environment 
when the actual changes will have to operate correctly in a tiered environment (and JVMCI).


Thanks,
David


Thanks, Richard.

-Original Message-
From: Vladimir Kozlov 
Sent: Donnerstag, 12. Dezember 2019 19:20
To: David Holmes ; hotspot-runtime-...@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; serviceability-dev@openjdk.java.net; Reingruber, Richard 

Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of 
JVMTI Agents


Hi David,

Tiered is disabled because we don't want to see compilations and outputs
from C1 compiler which does not have EA.

The test is specifically written for C2 only (not for C1 or Graal) to
verify its Escape Analysis optimization.
I did not look in great details into test's code but its analysis may be
affected if C1 compiler is also used.

Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:

On 12/12/2019 5:21 am, Vladimir Kozlov wrote:

I will do full review later. I want to comment about test command line.

You don't need vm.opt.TieredCompilation != true in @requires because
you specified -XX:-TieredCompilation in @run command.


And per my comment this should be being tested with tiered as well.

David


Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
test from running in Interpreter mode too.

Thanks,
Vladimir

On 12/11/19 7:07 AM, Reingruber, Richard wrote:

Hi David,

    > Most of the details here are in areas I can comment on in
detail, but I
    > did take an initial general look at things.

Thanks for taking the time!

    > The only thing that jumped out at me is that I think the
    > DeoptimizeObjectsALotThread should be a hidden thread.
    >
    > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

    > Also I don't see any testing of the DeoptimizeObjectsALotThread.
Without
    > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger
workload. I will add a minimal test
to keep it fresh.

    > Also on the tests I don't understand your @requires clause:
    >
    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
    > (vm.opt.TieredCompilation != true))
    >
    > This seems to require that TieredCompilation is disabled, but
tiered is
    > our normal mode of operation. ??
    >

I removed the clause. I guess I wanted to target the tests towards
the code they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and
with just one compiler thread.

Additionally I will make use of
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks,
Richard.

-Original Message-
From: David Holmes 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ;
serviceability-dev@openjdk.java.net;
hotspot-compiler-...@openjdk.java.net;
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
Performance in the Presence of JVMTI Agents

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-12 Thread David Holmes

Hi Richard,

Some further queries/concerns:

src/hotspot/share/runtime/objectMonitor.cpp

Can you please explain the changes to ObjectMonitor::wait:

!   _recursions = save  // restore the old recursion count
! + jt->get_and_reset_relock_count_after_wait(); // 
increased by the deferred relock count


what is the "deferred relock count"? I gather it relates to

"The code was extended to be able to deoptimize objects of a frame that 
is not the top frame and to let another thread than the owning thread do 
it."


which I don't like the sound of at all when it comes to ObjectMonitor 
state. So I'd like to understand in detail exactly what is going on here 
and why.  This is a very intrusive change that seems to badly break 
encapsulation and impacts future changes to ObjectMonitor that are under 
investigation.


---

src/hotspot/share/runtime/thread.cpp

Can you please explain why JavaThread::wait_for_object_deoptimization 
has to be handcrafted in this way rather than using proper transitions.


We got rid of "deopt suspend" some time ago and it is disturbing to see 
it being added back (effectively). This seems like it may be something 
that handshakes could be used for.


Thanks,
David
-

On 12/12/2019 7:02 am, David Holmes wrote:

On 12/12/2019 1:07 am, Reingruber, Richard wrote:

Hi David,

   > Most of the details here are in areas I can comment on in detail, 
but I

   > did take an initial general look at things.

Thanks for taking the time!


Apologies the above should read:

"Most of the details here are in areas I *can't* comment on in detail ..."

David


   > The only thing that jumped out at me is that I think the
   > DeoptimizeObjectsALotThread should be a hidden thread.
   >
   > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

   > Also I don't see any testing of the DeoptimizeObjectsALotThread. 
Without

   > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger 
workload. I will add a minimal test

to keep it fresh.

   > Also on the tests I don't understand your @requires clause:
   >
   >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
   > (vm.opt.TieredCompilation != true))
   >
   > This seems to require that TieredCompilation is disabled, but 
tiered is

   > our normal mode of operation. ??
   >

I removed the clause. I guess I wanted to target the tests towards the 
code they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and 
with just one compiler thread.


Additionally I will make use of 
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.


Thanks,
Richard.

-Original Message-
From: David Holmes 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
Performance in the Presence of JVMTI Agents


Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
issues (thanks!). In addition the
change is being tested at SAP since I posted the first RFR some 
months ago.


The intention of this enhancement is to benefit performance wise from 
escape analysis even if JVMTI
agents request capabilities that allow them to access local variable 
values. E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
escape analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one 
should do so. With the
enhancement, escape analysis will remain enabled until and after a 
debugger attaches. EA based
optimizations are reverted just before an agent acquires the 
reference to an object. In the JBS item

you'll find more details.


Most of the details here are in areas I can comment on in detail, but I
did take an initial general look at things.

The only thing that jumped out at me is that I think the
DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
active testing this will just bit-rot.

Also on the tests I don't understand your @requires clause:

   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
(vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is
our normal mode of operation. ??

Thanks,
David


Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-12 Thread David Holmes

On 13/12/2019 9:02 am, Reingruber, Richard wrote:

Hello Vladimir,

thanks for having a look.

   > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
   > test from running in Interpreter mode too.

Done.

   > You don't need vm.opt.TieredCompilation != true in @requires because you
   > specified -XX:-TieredCompilation in @run command.

Ok.

   > The test is specifically written for C2 only (not for C1 or Graal) to
   > verify its Escape Analysis optimization.
   > I did not look in great details into test's code but its analysis may be
   > affected if C1 compiler is also used.
   >
   > Richard may clarify this.

The test cases aim to get their testmethod 'dontinline_testMethod' compiled by 
C2. If they get C1
compiled before doesn't matter all that much. I've got a slight preference to 
disabled tiered
compilation for simplicity.


My concern - perhaps unfounded - is that this seems to be being tested 
only in a pure C2 environment when the actual changes will have to 
operate correctly in a tiered environment (and JVMCI).


Thanks,
David


Thanks, Richard.

-Original Message-
From: Vladimir Kozlov 
Sent: Donnerstag, 12. Dezember 2019 19:20
To: David Holmes ; hotspot-runtime-...@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; serviceability-dev@openjdk.java.net; Reingruber, 
Richard 
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi David,

Tiered is disabled because we don't want to see compilations and outputs
from C1 compiler which does not have EA.

The test is specifically written for C2 only (not for C1 or Graal) to
verify its Escape Analysis optimization.
I did not look in great details into test's code but its analysis may be
affected if C1 compiler is also used.

Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:

On 12/12/2019 5:21 am, Vladimir Kozlov wrote:

I will do full review later. I want to comment about test command line.

You don't need vm.opt.TieredCompilation != true in @requires because
you specified -XX:-TieredCompilation in @run command.


And per my comment this should be being tested with tiered as well.

David


Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
test from running in Interpreter mode too.

Thanks,
Vladimir

On 12/11/19 7:07 AM, Reingruber, Richard wrote:

Hi David,

    > Most of the details here are in areas I can comment on in
detail, but I
    > did take an initial general look at things.

Thanks for taking the time!

    > The only thing that jumped out at me is that I think the
    > DeoptimizeObjectsALotThread should be a hidden thread.
    >
    > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

    > Also I don't see any testing of the DeoptimizeObjectsALotThread.
Without
    > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger
workload. I will add a minimal test
to keep it fresh.

    > Also on the tests I don't understand your @requires clause:
    >
    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
    > (vm.opt.TieredCompilation != true))
    >
    > This seems to require that TieredCompilation is disabled, but
tiered is
    > our normal mode of operation. ??
    >

I removed the clause. I guess I wanted to target the tests towards
the code they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and
with just one compiler thread.

Additionally I will make use of
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks,
Richard.

-Original Message-
From: David Holmes 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ;
serviceability-dev@openjdk.java.net;
hotspot-compiler-...@openjdk.java.net;
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
Performance in the Presence of JVMTI Agents

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
issues (thanks!). In addition the
change is being tested at SAP since I posted the first RFR some
months ago.

The intention of this enhancement is to benefit performance wise
from escape analysis even if JVMTI
agents request capabilities that allow them to access local variable
values. E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
escape analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one
should do so. With the
enhancement, escape analysis will remain 

RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-12 Thread Reingruber, Richard
Hello Vladimir,

thanks for having a look.

  > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
  > test from running in Interpreter mode too.

Done.

  > You don't need vm.opt.TieredCompilation != true in @requires because you
  > specified -XX:-TieredCompilation in @run command.

Ok.

  > The test is specifically written for C2 only (not for C1 or Graal) to
  > verify its Escape Analysis optimization.
  > I did not look in great details into test's code but its analysis may be
  > affected if C1 compiler is also used.
  > 
  > Richard may clarify this.

The test cases aim to get their testmethod 'dontinline_testMethod' compiled by 
C2. If they get C1
compiled before doesn't matter all that much. I've got a slight preference to 
disabled tiered
compilation for simplicity.

Thanks, Richard.

-Original Message-
From: Vladimir Kozlov  
Sent: Donnerstag, 12. Dezember 2019 19:20
To: David Holmes ; 
hotspot-runtime-...@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
serviceability-dev@openjdk.java.net; Reingruber, Richard 

Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi David,

Tiered is disabled because we don't want to see compilations and outputs 
from C1 compiler which does not have EA.

The test is specifically written for C2 only (not for C1 or Graal) to 
verify its Escape Analysis optimization.
I did not look in great details into test's code but its analysis may be 
affected if C1 compiler is also used.

Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:
> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>> I will do full review later. I want to comment about test command line.
>>
>> You don't need vm.opt.TieredCompilation != true in @requires because 
>> you specified -XX:-TieredCompilation in @run command.
> 
> And per my comment this should be being tested with tiered as well.
> 
> David
> 
>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip 
>> test from running in Interpreter mode too.
>>
>> Thanks,
>> Vladimir
>>
>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>    > Most of the details here are in areas I can comment on in 
>>> detail, but I
>>>    > did take an initial general look at things.
>>>
>>> Thanks for taking the time!
>>>
>>>    > The only thing that jumped out at me is that I think the
>>>    > DeoptimizeObjectsALotThread should be a hidden thread.
>>>    >
>>>    > +  bool is_hidden_from_external_view() const { return true; }
>>>
>>> Yes, it should. Will add the method like above.
>>>
>>>    > Also I don't see any testing of the DeoptimizeObjectsALotThread. 
>>> Without
>>>    > active testing this will just bit-rot.
>>>
>>> DeoptimizeObjectsALot is meant for stress testing with a larger 
>>> workload. I will add a minimal test
>>> to keep it fresh.
>>>
>>>    > Also on the tests I don't understand your @requires clause:
>>>    >
>>>    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>    > (vm.opt.TieredCompilation != true))
>>>    >
>>>    > This seems to require that TieredCompilation is disabled, but 
>>> tiered is
>>>    > our normal mode of operation. ??
>>>    >
>>>
>>> I removed the clause. I guess I wanted to target the tests towards 
>>> the code they are supposed to
>>> test, and it's easier to analyze failures w/o tiered compilation and 
>>> with just one compiler thread.
>>>
>>> Additionally I will make use of 
>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>
>>> Thanks,
>>> Richard.
>>>
>>> -Original Message-
>>> From: David Holmes 
>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>> To: Reingruber, Richard ; 
>>> serviceability-dev@openjdk.java.net; 
>>> hotspot-compiler-...@openjdk.java.net; 
>>> hotspot-runtime-...@openjdk.java.net
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi Richard,
>>>
>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
 Hi,

 I would like to get reviews please for

 http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

 Corresponding RFE:
 https://bugs.openjdk.java.net/browse/JDK-8227745

 Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
 And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

 Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
 issues (thanks!). In addition the
 change is being tested at SAP since I posted the first RFR some 
 months ago.

 The intention of this enhancement is to benefit performance wise 
 from escape analysis even if JVMTI
 agents request capabilities that allow them to access local variable 
 values. E.g. if you start-up
 with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
 escape analysis is disabled right
 from the beginning, well before a 

Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-12 Thread Vladimir Kozlov

Hi David,

Tiered is disabled because we don't want to see compilations and outputs 
from C1 compiler which does not have EA.


The test is specifically written for C2 only (not for C1 or Graal) to 
verify its Escape Analysis optimization.
I did not look in great details into test's code but its analysis may be 
affected if C1 compiler is also used.


Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:

On 12/12/2019 5:21 am, Vladimir Kozlov wrote:

I will do full review later. I want to comment about test command line.

You don't need vm.opt.TieredCompilation != true in @requires because 
you specified -XX:-TieredCompilation in @run command.


And per my comment this should be being tested with tiered as well.

David

Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip 
test from running in Interpreter mode too.


Thanks,
Vladimir

On 12/11/19 7:07 AM, Reingruber, Richard wrote:

Hi David,

   > Most of the details here are in areas I can comment on in 
detail, but I

   > did take an initial general look at things.

Thanks for taking the time!

   > The only thing that jumped out at me is that I think the
   > DeoptimizeObjectsALotThread should be a hidden thread.
   >
   > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

   > Also I don't see any testing of the DeoptimizeObjectsALotThread. 
Without

   > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger 
workload. I will add a minimal test

to keep it fresh.

   > Also on the tests I don't understand your @requires clause:
   >
   >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
   > (vm.opt.TieredCompilation != true))
   >
   > This seems to require that TieredCompilation is disabled, but 
tiered is

   > our normal mode of operation. ??
   >

I removed the clause. I guess I wanted to target the tests towards 
the code they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and 
with just one compiler thread.


Additionally I will make use of 
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.


Thanks,
Richard.

-Original Message-
From: David Holmes 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; 
hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
Performance in the Presence of JVMTI Agents


Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
issues (thanks!). In addition the
change is being tested at SAP since I posted the first RFR some 
months ago.


The intention of this enhancement is to benefit performance wise 
from escape analysis even if JVMTI
agents request capabilities that allow them to access local variable 
values. E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
escape analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one 
should do so. With the
enhancement, escape analysis will remain enabled until and after a 
debugger attaches. EA based
optimizations are reverted just before an agent acquires the 
reference to an object. In the JBS item

you'll find more details.


Most of the details here are in areas I can comment on in detail, but I
did take an initial general look at things.

The only thing that jumped out at me is that I think the
DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
active testing this will just bit-rot.

Also on the tests I don't understand your @requires clause:

   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
(vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is
our normal mode of operation. ??

Thanks,
David


Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 





Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-11 Thread David Holmes

On 12/12/2019 1:07 am, Reingruber, Richard wrote:

Hi David,

   > Most of the details here are in areas I can comment on in detail, but I
   > did take an initial general look at things.

Thanks for taking the time!


Apologies the above should read:

"Most of the details here are in areas I *can't* comment on in detail ..."

David


   > The only thing that jumped out at me is that I think the
   > DeoptimizeObjectsALotThread should be a hidden thread.
   >
   > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

   > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
   > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger workload. I 
will add a minimal test
to keep it fresh.

   > Also on the tests I don't understand your @requires clause:
   >
   >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
   > (vm.opt.TieredCompilation != true))
   >
   > This seems to require that TieredCompilation is disabled, but tiered is
   > our normal mode of operation. ??
   >

I removed the clause. I guess I wanted to target the tests towards the code 
they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and with just 
one compiler thread.

Additionally I will make use of 
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks,
Richard.

-Original Message-
From: David Holmes 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
(thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape 
analysis even if JVMTI
agents request capabilities that allow them to access local variable values. 
E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do 
so. With the
enhancement, escape analysis will remain enabled until and after a debugger 
attaches. EA based
optimizations are reverted just before an agent acquires the reference to an 
object. In the JBS item
you'll find more details.


Most of the details here are in areas I can comment on in detail, but I
did take an initial general look at things.

The only thing that jumped out at me is that I think the
DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
active testing this will just bit-rot.

Also on the tests I don't understand your @requires clause:

   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
(vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is
our normal mode of operation. ??

Thanks,
David


Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
  http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch



RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-11 Thread Reingruber, Richard
Hi David,

  > Most of the details here are in areas I can comment on in detail, but I 
  > did take an initial general look at things.

Thanks for taking the time!

  > The only thing that jumped out at me is that I think the 
  > DeoptimizeObjectsALotThread should be a hidden thread.
  > 
  > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

  > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without 
  > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger workload. I 
will add a minimal test
to keep it fresh.

  > Also on the tests I don't understand your @requires clause:
  > 
  >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & 
  > (vm.opt.TieredCompilation != true))
  > 
  > This seems to require that TieredCompilation is disabled, but tiered is 
  > our normal mode of operation. ??
  > 

I removed the clause. I guess I wanted to target the tests towards the code 
they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and with just 
one compiler thread.

Additionally I will make use of 
compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks,
Richard.

-Original Message-
From: David Holmes  
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard ; 
serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; 
hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in 
the Presence of JVMTI Agents

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> Hi,
> 
> I would like to get reviews please for
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> 
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
> 
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> 
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
> (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
> 
> The intention of this enhancement is to benefit performance wise from escape 
> analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. 
> E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
> analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do 
> so. With the
> enhancement, escape analysis will remain enabled until and after a debugger 
> attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an 
> object. In the JBS item
> you'll find more details.

Most of the details here are in areas I can comment on in detail, but I 
did take an initial general look at things.

The only thing that jumped out at me is that I think the 
DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without 
active testing this will just bit-rot.

Also on the tests I don't understand your @requires clause:

  @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & 
(vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is 
our normal mode of operation. ??

Thanks,
David

> Thanks,
> Richard.
> 
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>  
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> 


Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-10 Thread David Holmes

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
(thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape 
analysis even if JVMTI
agents request capabilities that allow them to access local variable values. 
E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do 
so. With the
enhancement, escape analysis will remain enabled until and after a debugger 
attaches. EA based
optimizations are reverted just before an agent acquires the reference to an 
object. In the JBS item
you'll find more details.


Most of the details here are in areas I can comment on in detail, but I 
did take an initial general look at things.


The only thing that jumped out at me is that I think the 
DeoptimizeObjectsALotThread should be a hidden thread.


+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without 
active testing this will just bit-rot.


Also on the tests I don't understand your @requires clause:

 @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & 
(vm.opt.TieredCompilation != true))


This seems to require that TieredCompilation is disabled, but tiered is 
our normal mode of operation. ??


Thanks,
David


Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
 http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch



RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

2019-12-10 Thread Reingruber, Richard
Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues 
(thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape 
analysis even if JVMTI
agents request capabilities that allow them to access local variable values. 
E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape 
analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do 
so. With the
enhancement, escape analysis will remain enabled until and after a debugger 
attaches. EA based
optimizations are reverted just before an agent acquires the reference to an 
object. In the JBS item
you'll find more details.

Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch