Re: RFR: 8287007: [cgroups] Consistently use stringStream throughout parsing code [v3]

2022-06-12 Thread Ioi Lam
On Wed, 8 Jun 2022 12:09:27 GMT, Severin Gehwolf  wrote:

>> Please review this cleanup change in the cgroup subsystem which used to use 
>> hard-coded stack allocated
>> buffers for concatenating strings in memory. We can use `stringStream` 
>> instead which doesn't have the issue
>> of hard-coding maximum lengths (and related checks) and makes the code, 
>> thus, easier to read.
>> 
>> While at it, I've written basic `gtests` verifying current behaviour (and 
>> the same for the JDK code). From
>> a functionality point of view this is a no-op (modulo the one bug in the 
>> substring case which seems to be
>> there since day one). I'd prefer if we could defer any change in 
>> functionality to a separate bug as this is
>> meant to only use stringStream throughout the cgroups code.
>> 
>> Testing:
>> - [x] Container tests on Linux x86_64 cgroups v1 and cgroups v2
>> - [x] Added tests, which I've verified also pass before the stringStream 
>> change
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Remove fix for substring match

Just came back from vacation. LGTM.

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.org/jdk/pull/8969


Re: RFR: 8287007: [cgroups] Consistently use stringStream throughout parsing code [v2]

2022-06-08 Thread Ioi Lam
On Tue, 7 Jun 2022 12:42:26 GMT, Severin Gehwolf  wrote:

>> Please review this cleanup change in the cgroup subsystem which used to use 
>> hard-coded stack allocated
>> buffers for concatenating strings in memory. We can use `stringStream` 
>> instead which doesn't have the issue
>> of hard-coding maximum lengths (and related checks) and makes the code, 
>> thus, easier to read.
>> 
>> While at it, I've written basic `gtests` verifying current behaviour (and 
>> the same for the JDK code). From
>> a functionality point of view this is a no-op (modulo the one bug in the 
>> substring case which seems to be
>> there since day one). I'd prefer if we could defer any change in 
>> functionality to a separate bug as this is
>> meant to only use stringStream throughout the cgroups code.
>> 
>> Testing:
>> - [x] Container tests on Linux x86_64 cgroups v1 and cgroups v2
>> - [x] Added tests, which I've verified also pass before the stringStream 
>> change
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains six additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into jdk-8287007-string-stream
>  - Add cgroups v2 java test
>  - use stringStream in cgroups v2
>  - Add gtest for cgroups v2 code path
>
>Also fixes the bug when cgroup path is '/'.
>  - 8287007: [cgroups] Consistently use stringStream throughout parsing code
>  - 8287007: [cgroups] Consistently use stringStream throughout parsing code

Changes requested by iklam (Reviewer).

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 54:

> 52:   } else {
> 53: char *p = strstr(cgroup_path, _root);
> 54: if (p != NULL && p == cgroup_path) {

I think this change should be done in a separate bug, because it will cause the 
`if` block to be executed. Previously the `if` block is never executed (unless 
`cgroup_path == _root` ??, but then it will just set `_path` to the same string 
as `_mount_point`) -- so we don't even know if this change of behavior might do 
something harmful.

-

PR: https://git.openjdk.java.net/jdk/pull/8969


Re: RFR: 8287663 Add a regression test for JDK-8287073

2022-06-06 Thread Ioi Lam
On Mon, 6 Jun 2022 23:07:06 GMT, Ioi Lam  wrote:

> We should try to consolidate these test cases to improve maintainability.

I filed [JDK-8287185](https://bugs.openjdk.org/browse/JDK-8287185)

-

PR: https://git.openjdk.java.net/jdk/pull/8993


Re: RFR: 8287663 Add a regression test for JDK-8287073

2022-06-06 Thread Ioi Lam
On Thu, 2 Jun 2022 14:32:28 GMT, Severin Gehwolf  wrote:

> This adds a regression test for a recent fix (JDK-8287073). I've restructured 
> the linux specific JDK code to call a separate static function to enable this 
> test. It'll help future tests too.
> 
> Testing:
> - [x] Container tests continue to pass + GHA
> - [x] New regression test fails prior the code fix of JDK-8287073 and passes 
> with it

Not specific to this PR, but we have a general problem with lots of duplication 
between these two files. E.g.,


https://github.com/openjdk/jdk/blob/a50b06e85124f61b5133189a2a2e461753d5d9e7/test/hotspot/jtreg/containers/cgroup/CgroupSubsystemFactory.java#L132-L143

https://github.com/openjdk/jdk/blob/a50b06e85124f61b5133189a2a2e461753d5d9e7/test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java#L130-L143

We should try to consolidate these test cases to improve maintainability.

-

PR: https://git.openjdk.java.net/jdk/pull/8993


Re: RFR: 8287663 Add a regression test for JDK-8287073

2022-06-06 Thread Ioi Lam
On Thu, 2 Jun 2022 14:32:28 GMT, Severin Gehwolf  wrote:

> This adds a regression test for a recent fix (JDK-8287073). I've restructured 
> the linux specific JDK code to call a separate static function to enable this 
> test. It'll help future tests too.
> 
> Testing:
> - [x] Container tests continue to pass + GHA
> - [x] New regression test fails prior the code fix of JDK-8287073 and passes 
> with it

LGTM.

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8993


Re: RFR: 8287073: NPE from CgroupV2Subsystem.getInstance() [v3]

2022-05-29 Thread Ioi Lam
On Thu, 26 May 2022 16:04:17 GMT, Maxim Kartashev  
wrote:

>> Following the logic from the comment directly above the changed line, since 
>> it doesn't matter which controller we pick, pick any available controller 
>> instead of the one called "memory" specifically. This way we are guarded 
>> against getting `null` as `anyController`, which is being immediately passed 
>> down to `CgroupV2Subsystem.getInstance()` that is unprepared to accept 
>> `null` values. 
>> 
>> It is also worth noting that the previous checks (such as that at line 89) 
>> make sure that there exist at least one controller in the map.
>
> Maxim Kartashev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Made requested changes

I tested the patch in our CI pipeline. All container tests passed.

-

PR: https://git.openjdk.java.net/jdk/pull/8803


Re: RFR: 8287384: Speed up ForceGC

2022-05-26 Thread Ioi Lam
On Thu, 26 May 2022 18:50:07 GMT, Xue-Lei Andrew Fan  wrote:

> Hi,
> 
> May I have this test update reviewed?
> 
> The ForceGC could be enhanced by using smaller wait/sleep time, and shared 
> cleaner.
> 
> Thanks,
> Xuelei

Please rename the RFE title to be less generic. How about "Speed up 
jdk.test.lib.util.ForceGC"?

-

Changes requested by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8907


Re: RFR: 8287073: NPE from CgroupV2Subsystem.getInstance() [v3]

2022-05-26 Thread Ioi Lam
On Thu, 26 May 2022 16:04:17 GMT, Maxim Kartashev  
wrote:

>> Following the logic from the comment directly above the changed line, since 
>> it doesn't matter which controller we pick, pick any available controller 
>> instead of the one called "memory" specifically. This way we are guarded 
>> against getting `null` as `anyController`, which is being immediately passed 
>> down to `CgroupV2Subsystem.getInstance()` that is unprepared to accept 
>> `null` values. 
>> 
>> It is also worth noting that the previous checks (such as that at line 89) 
>> make sure that there exist at least one controller in the map.
>
> Maxim Kartashev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Made requested changes

Marked as reviewed by iklam (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/8803


Re: RFR: 8287073: NPE from CgroupV2Subsystem.getInstance() [v2]

2022-05-26 Thread Ioi Lam
On Thu, 26 May 2022 09:42:22 GMT, Maxim Kartashev  
wrote:

>> Following the logic from the comment directly above the changed line, since 
>> it doesn't matter which controller we pick, pick any available controller 
>> instead of the one called "memory" specifically. This way we are guarded 
>> against getting `null` as `anyController`, which is being immediately passed 
>> down to `CgroupV2Subsystem.getInstance()` that is unprepared to accept 
>> `null` values. 
>> 
>> It is also worth noting that the previous checks (such as that at line 89) 
>> make sure that there exist at least one controller in the map.
>
> Maxim Kartashev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Removed unnecessary null checks

Changes requested by iklam (Reviewer).

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java 
line 113:

> 111: CgroupInfo anyController = infos.values().iterator().next();
> 112: CgroupSubsystem subsystem = 
> CgroupV2Subsystem.getInstance(anyController);
> 113: return new CgroupMetrics(subsystem);

Should add `Objects.requireNonNull(anyController)` and 
`Objects.requireNonNull(subsystem)`.

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java 
line 116:

> 114: } else {
> 115: CgroupV1Subsystem subsystem = 
> CgroupV1Subsystem.getInstance(infos);
> 116: return new CgroupV1MetricsImpl(subsystem);

This shouldn't be changed because the current implementation of 
`CgroupV1Subsystem.getInstance(infos)` has a path that returns null.

Maybe that's impossible, because when we call `CgroupV1Subsystem.getInstance`, 
we must have at least one v1 subsystem in `infos`. However, that's not related 
to this issue.  Please fix that in a separate RFE. For example, 
`CgroupV1Subsystem.getInstance(infos)`  can be changed to throw an exception 
instead if return null.

-

PR: https://git.openjdk.java.net/jdk/pull/8803


Integrated: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller

2022-05-25 Thread Ioi Lam
On Mon, 23 May 2022 22:11:47 GMT, Ioi Lam  wrote:

> This PR fixes a bug found on an Ubuntu host that's mostly running with 
> cgroupv2, but there's a controller (freezer) that is mounted in cgroupv1 mode.
> 
> The container support code in the VM and JDK checks if we have simultaneously 
> mounted v1 and v2 containers. If so, we revert to "hybrid" mode (which is 
> essentially v1).
> 
> However, the "hybrid checks" only considers the following controllers that 
> Java cares about:
> 
> - cpu
> - cpuacct
> - cpuset
> - blkio
> - memory
> - pids
> 
> If only "freezer" is mounted in v1 mode, Java will start in V2 mode. Later, 
> when `setCgroupV2Path()` sees the first line in /proc/self/cgroup, it runs 
> into the assert.
> 
> 
> $ cat /proc/self/cgroup
> 1:freezer:/
> 0::/user.slice/user-1001.slice/session-85.scope
> 
> 
> The fix is simple -- when we get to `setCgroupV2Path()`, we have already 
> decided that the system has not mounted any v1 controllers that we care 
> about, so we should just ignore all the v1 controllers (which has a non-empty 
> name).

This pull request has now been integrated.

Changeset: 704b9a66
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/704b9a66bba0dc8adb62be80fd62864b9c687c3f
Stats: 117 lines in 3 files changed: 114 ins; 0 del; 3 mod

8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller

Reviewed-by: mseledtsov, sgehwolf

-

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v2]

2022-05-25 Thread Ioi Lam
On Tue, 24 May 2022 19:36:57 GMT, Severin Gehwolf  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   @jerboaa comments
>
> Looks good. Thanks for the thorough analysis.

Thanks @jerboaa and @mseledts for the review.

-

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v4]

2022-05-25 Thread Ioi Lam
> This PR fixes a bug found on an Ubuntu host that's mostly running with 
> cgroupv2, but there's a controller (freezer) that is mounted in cgroupv1 mode.
> 
> The container support code in the VM and JDK checks if we have simultaneously 
> mounted v1 and v2 containers. If so, we revert to "hybrid" mode (which is 
> essentially v1).
> 
> However, the "hybrid checks" only considers the following controllers that 
> Java cares about:
> 
> - cpu
> - cpuacct
> - cpuset
> - blkio
> - memory
> - pids
> 
> If only "freezer" is mounted in v1 mode, Java will start in V2 mode. Later, 
> when `setCgroupV2Path()` sees the first line in /proc/self/cgroup, it runs 
> into the assert.
> 
> 
> $ cat /proc/self/cgroup
> 1:freezer:/
> 0::/user.slice/user-1001.slice/session-85.scope
> 
> 
> The fix is simple -- when we get to `setCgroupV2Path()`, we have already 
> decided that the system has not mounted any v1 controllers that we care 
> about, so we should just ignore all the v1 controllers (which has a non-empty 
> name).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  fixed typo in comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8858/files
  - new: https://git.openjdk.java.net/jdk/pull/8858/files/9134182e..a13afe47

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8858.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8858/head:pull/8858

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v3]

2022-05-25 Thread Ioi Lam
On Wed, 25 May 2022 15:50:32 GMT, Severin Gehwolf  wrote:

>> It confused me, fwiw. Anyway up to you. It's not super important.
>
> works for me. +1. Note the typo 
> `anyCgroupsV1Controller/anyCgroupsV2Controller` not **V1** twice.

Oops, I'll fixed that. Thanks!

-

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v3]

2022-05-25 Thread Ioi Lam
> This PR fixes a bug found on an Ubuntu host that's mostly running with 
> cgroupv2, but there's a controller (freezer) that is mounted in cgroupv1 mode.
> 
> The container support code in the VM and JDK checks if we have simultaneously 
> mounted v1 and v2 containers. If so, we revert to "hybrid" mode (which is 
> essentially v1).
> 
> However, the "hybrid checks" only considers the following controllers that 
> Java cares about:
> 
> - cpu
> - cpuacct
> - cpuset
> - blkio
> - memory
> - pids
> 
> If only "freezer" is mounted in v1 mode, Java will start in V2 mode. Later, 
> when `setCgroupV2Path()` sees the first line in /proc/self/cgroup, it runs 
> into the assert.
> 
> 
> $ cat /proc/self/cgroup
> 1:freezer:/
> 0::/user.slice/user-1001.slice/session-85.scope
> 
> 
> The fix is simple -- when we get to `setCgroupV2Path()`, we have already 
> decided that the system has not mounted any v1 controllers that we care 
> about, so we should just ignore all the v1 controllers (which has a non-empty 
> name).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  fixed comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8858/files
  - new: https://git.openjdk.java.net/jdk/pull/8858/files/1f17b6e8..9134182e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=01-02

  Stats: 8 lines in 1 file changed: 4 ins; 4 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8858.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8858/head:pull/8858

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v3]

2022-05-25 Thread Ioi Lam
On Tue, 24 May 2022 19:49:35 GMT, Ioi Lam  wrote:

>> My bad. How about `Intentional incomplete switch. There are ...`? Anyway, 
>> why is the empty `default` case needed other than for the comment?
>
> To me, the `default:` switch is a clear indication that "everything else 
> comes here". So you won't be confused whether the comment is related to the 
> last line just above the comment.

If you don't like the `default:` coding style, how about this:


switch (info.getName()) {
// Only the following controllers are important to Java. All
// other controllers (such as freezer) are ignored and
// are not considered in the checks below for
// anyCgroupsV1Controller/anyCgroupsV1Controller.
case CPU_CTRL:  infos.put(CPU_CTRL, info); break;
case CPUACCT_CTRL:  infos.put(CPUACCT_CTRL, info); break;
case CPUSET_CTRL:   infos.put(CPUSET_CTRL, info); break;
case MEMORY_CTRL:   infos.put(MEMORY_CTRL, info); break;
case BLKIO_CTRL:infos.put(BLKIO_CTRL, info); break;
case PIDS_CTRL: infos.put(PIDS_CTRL, info); break;
}

-

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v2]

2022-05-24 Thread Ioi Lam
On Tue, 24 May 2022 19:34:16 GMT, Severin Gehwolf  wrote:

>> This is not a fall-through because the previous line ends with a `break`.
>
> My bad. How about `Intentional incomplete switch. There are ...`? Anyway, why 
> is the empty `default` case needed other than for the comment?

To me, the `default:` switch is a clear indication that "everything else comes 
here". So you won't be confused whether the comment is related to the last line 
just above the comment.

-

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v2]

2022-05-24 Thread Ioi Lam
On Tue, 24 May 2022 10:12:31 GMT, Severin Gehwolf  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   @jerboaa comments
>
> src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java 
> line 155:
> 
>> 153: // There are some controllers (such as freezer) that 
>> Java doesn't
>> 154: // care about. Just ignore them. These are not 
>> considered in the
>> 155: // anyCgroupsV1Controller/anyCgroupsV1Controller checks.
> 
> It's not clear why this `default` is necessary. Could we just add the comment 
> like so:
> 
> 
> // Intentional fall-through. There are controllers (such as freezer) that
> // Java doesn't care about. Just ignore them. Only listed controllers are
> // considered in the anyCgroupsV1Controller/anyCgroupsV2Controller checks.

This is not a fall-through because the previous line ends with a `break`.

> src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java 
> line 229:
> 
>> 227: String name = tokens[1];
>> 228: if (!name.equals("")) {
>> 229: // This is probably a v1 controller that we have ignored 
>> (e.g., freezer)
> 
> Could we add an assertion that the controller isn't in the `infos` map?
> 
>if (!name.equals("")) {
> // This must be a v1 controller that we have ignored (e.g., 
> freezer)
> assert infos.get(name) == null;
> return;
>  }

Fixed.

-

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller [v2]

2022-05-24 Thread Ioi Lam
> This PR fixes a bug found on an Ubuntu host that's mostly running with 
> cgroupv2, but there's a controller (freezer) that is mounted in cgroupv1 mode.
> 
> The container support code in the VM and JDK checks if we have simultaneously 
> mounted v1 and v2 containers. If so, we revert to "hybrid" mode (which is 
> essentially v1).
> 
> However, the "hybrid checks" only considers the following controllers that 
> Java cares about:
> 
> - cpu
> - cpuacct
> - cpuset
> - blkio
> - memory
> - pids
> 
> If only "freezer" is mounted in v1 mode, Java will start in V2 mode. Later, 
> when `setCgroupV2Path()` sees the first line in /proc/self/cgroup, it runs 
> into the assert.
> 
> 
> $ cat /proc/self/cgroup
> 1:freezer:/
> 0::/user.slice/user-1001.slice/session-85.scope
> 
> 
> The fix is simple -- when we get to `setCgroupV2Path()`, we have already 
> decided that the system has not mounted any v1 controllers that we care 
> about, so we should just ignore all the v1 controllers (which has a non-empty 
> name).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  @jerboaa comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8858/files
  - new: https://git.openjdk.java.net/jdk/pull/8858/files/c4d8354d..1f17b6e8

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=00-01

  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8858.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8858/head:pull/8858

PR: https://git.openjdk.java.net/jdk/pull/8858


RFR: 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer controller

2022-05-23 Thread Ioi Lam
This PR fixes a bug found on an Ubuntu host that's mostly running with 
cgroupv2, but there's a controller (freezer) that is mounted in cgroupv1 mode.

The container support code in the VM and JDK checks if we have simultaneously 
mounted v1 and v2 containers. If so, we revert to "hybrid" mode (which is 
essentially v1).

However, the "hybrid checks" only considers the following controllers that Java 
cares about:

- cpu
- cpuacct
- cpuset
- blkio
- memory
- pids

If only "freezer" is mounted in v1 mode, Java will start in V2 mode. Later, 
when `setCgroupV2Path()` sees the first line in /proc/self/cgroup, it runs into 
the assert.


$ cat /proc/self/cgroup
1:freezer:/
0::/user.slice/user-1001.slice/session-85.scope


The fix is simple -- when we get to `setCgroupV2Path()`, we have already 
decided that the system has not mounted any v1 controllers that we care about, 
so we should just ignore all the v1 controllers (which has a non-empty name).

-

Commit messages:
 - 8287107: CgroupSubsystemFactory.setCgroupV2Path asserts with freezer 
controller

Changes: https://git.openjdk.java.net/jdk/pull/8858/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8858&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8287107
  Stats: 116 lines in 3 files changed: 113 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8858.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8858/head:pull/8858

PR: https://git.openjdk.java.net/jdk/pull/8858


Re: RFR: 8286212: Cgroup v1 initialization causes NPE on some systems [v3]

2022-05-19 Thread Ioi Lam
On Thu, 19 May 2022 19:59:18 GMT, Ioi Lam  wrote:

>>> What happens if cgroup_path is "/foo/bar" and _root is "/fo"?
>> 
>> With a mount path of `/bar` this ends up being `/bar/o/bar`. Pretty strange, 
>> but then again it's a bit of a contrived example as those paths come from 
>> `/proc` parsing. Anyway, this is code that got added with 
>> [JDK-8146115](https://bugs.openjdk.java.net/browse/JDK-8146115). It's not 
>> something I've written and to be honest, I'm not sure this branch is needed, 
>> but I didn't want to change the existing behaviour with this patch. I have 
>> no more insight than you in terms of why that approach has been taken.
>> 
>>> Maybe this block should be combined with the new `else` block you're adding?
>> 
>> Maybe, but I'm not sure if it would break something.
>> 
>>> However, the behavior seems opposite between these two blocks of code:
>>> 
>>> The upper block: _root is a prefix of cgroup_path, we take the **tail** of 
>>> cgroup_path
>>> 
>>> ```
>>>   TestCase substring_match = {
>>> "/sys/fs/cgroup/memory",   // mount_path
>>> "/user.slice/user-1000.slice", // root_path
>>> "/user.slice/user-1000.slice/user@1001.service",   // cgroup_path
>>> "/sys/fs/cgroup/memory/user@1001.service"  // expected_path
>>>   };
>>> ```
>> 
>> Yes. Though, I cannot comment on why that has been chosen. It's been there 
>> since day one :-/
>> 
>>> The lower block: The beginning of _root is a prefix of cgroup_path, we take 
>>> the **head** of cgroup_path
>>> 
>>> ```
>>>  TestCase prefix_matched_cg = {
>>> "/sys/fs/cgroup/memory",   // mount_path
>>> "/user.slice/user-1000.slice/session-50.scope",// root_path
>>> "/user.slice/user-1000.slice/session-3.scope", // cgroup_path
>>> "/sys/fs/cgroup/memory/user.slice/user-1000.slice" // expected_path
>>>   };
>>> ```
>>> 
>>> I find the behavior hard to understand. I think the rules (and reasons) 
>>> should be added to the comment block above the function.
>> 
>> The reason why I've gone down the path of adding the head of cgroup_path is 
>> because of this document (in conjunction to what the user was observing on 
>> an affected system):
>> https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
>> 
>> The user was observing paths as listed in the test:
>> 
>> "/user.slice/user-1000.slice/session-50.scope",// root_path
>> "/user.slice/user-1000.slice/session-3.scope", // cgroup_path
>> 
>> This very much looks like systemd managed. Given that and knowing that 
>> systemd adds processes into `scopes` or `services` and groups them via 
>> `slices` and also knowing that cgroups are hierarchical (i.e. limits of 
>> `/foo/bar` also apply to `/foo/bar/baz`), it seems likely that if there are 
>> any limits, then it'll be on `/user.slice/user-1000.slice` within the 
>> mounted controller. Unfortunately, I'm not able to reproduce this myself.
>
> I am wondering if the problem is this:
> 
> We have systemd running on the host, and a different copy of systemd that 
> runs inside the container.
> 
> - They both set up `/user.slice/user-1000.slice/session-??.scope` within 
> their own file systems
> - For some reason, when you're looking inside the container, 
> `/proc/self/cgroup` might use a path in the containerized file system whereas 
> `/proc/self/mountinfo` uses a path in the host file system. These two paths 
> may look alike but they have absolutely no relation to each other.
> 
> I have asked the reporter for more information:
> 
> https://gist.github.com/gaol/4d96eace8290e6549635fdc0ea41d0b4?permalink_comment_id=4172593#gistcomment-4172593
> 
> Meanwhile, I think the current method of finding "which directory under 
> /sys/fs/cgroup/memory controls my memory usage" is broken. As mentioned 
> about, the path you get from  `/proc/self/cgroup`  and `/proc/self/mountinfo` 
> have no relation to each other, but we use them anyway to get our answer, 
> with many ad-hoc methods that are not documented in the code.
> 
> Maybe we should do this instead?
> 
> - Read /proc/self/cgroup
> - Find the `10:memory:` line
> - If `/sys/fs/cgroup/m

Re: RFR: 8286212: Cgroup v1 initialization causes NPE on some systems [v3]

2022-05-19 Thread Ioi Lam
On Thu, 19 May 2022 09:06:18 GMT, Severin Gehwolf  wrote:

>> src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 63:
>> 
>>> 61:   } else {
>>> 62: char *p = strstr(cgroup_path, _root);
>>> 63: if (p != NULL && p == cgroup_path) {
>> 
>> What happens if cgroup_path is "/foo/bar" and _root is "/fo"?
>> 
>> Maybe this block should be combined with the new `else` block you're adding? 
>> However, the behavior seems opposite between these two blocks of code:
>> 
>> The upper block: _root is a prefix of cgroup_path, we take the **tail** of 
>> cgroup_path
>> 
>> 
>>   TestCase substring_match = {
>> "/sys/fs/cgroup/memory",   // mount_path
>> "/user.slice/user-1000.slice", // root_path
>> "/user.slice/user-1000.slice/user@1001.service",   // cgroup_path
>> "/sys/fs/cgroup/memory/user@1001.service"  // expected_path
>>   };
>> 
>> 
>> The lower block: The beginning of _root is a prefix of cgroup_path, we take 
>> the **head** of cgroup_path
>> 
>> 
>>  TestCase prefix_matched_cg = {
>> "/sys/fs/cgroup/memory",   // mount_path
>> "/user.slice/user-1000.slice/session-50.scope",// root_path
>> "/user.slice/user-1000.slice/session-3.scope", // cgroup_path
>> "/sys/fs/cgroup/memory/user.slice/user-1000.slice" // expected_path
>>   };
>> 
>> 
>> I find the behavior hard to understand. I think the rules (and reasons) 
>> should be added to the comment block above the function.
>
>> What happens if cgroup_path is "/foo/bar" and _root is "/fo"?
> 
> With a mount path of `/bar` this ends up being `/bar/o/bar`. Pretty strange, 
> but then again it's a bit of a contrived example as those paths come from 
> `/proc` parsing. Anyway, this is code that got added with 
> [JDK-8146115](https://bugs.openjdk.java.net/browse/JDK-8146115). It's not 
> something I've written and to be honest, I'm not sure this branch is needed, 
> but I didn't want to change the existing behaviour with this patch. I have no 
> more insight than you in terms of why that approach has been taken.
> 
>> Maybe this block should be combined with the new `else` block you're adding?
> 
> Maybe, but I'm not sure if it would break something.
> 
>> However, the behavior seems opposite between these two blocks of code:
>> 
>> The upper block: _root is a prefix of cgroup_path, we take the **tail** of 
>> cgroup_path
>> 
>> ```
>>   TestCase substring_match = {
>> "/sys/fs/cgroup/memory",   // mount_path
>> "/user.slice/user-1000.slice", // root_path
>> "/user.slice/user-1000.slice/user@1001.service",   // cgroup_path
>> "/sys/fs/cgroup/memory/user@1001.service"  // expected_path
>>   };
>> ```
> 
> Yes. Though, I cannot comment on why that has been chosen. It's been there 
> since day one :-/
> 
>> The lower block: The beginning of _root is a prefix of cgroup_path, we take 
>> the **head** of cgroup_path
>> 
>> ```
>>  TestCase prefix_matched_cg = {
>> "/sys/fs/cgroup/memory",   // mount_path
>> "/user.slice/user-1000.slice/session-50.scope",// root_path
>> "/user.slice/user-1000.slice/session-3.scope", // cgroup_path
>> "/sys/fs/cgroup/memory/user.slice/user-1000.slice" // expected_path
>>   };
>> ```
>> 
>> I find the behavior hard to understand. I think the rules (and reasons) 
>> should be added to the comment block above the function.
> 
> The reason why I've gone down the path of adding the head of cgroup_path is 
> because of this document (in conjunction to what the user was observing on an 
> affected system):
> https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
> 
> The user was observing paths as listed in the test:
> 
> "/user.slice/user-1000.slice/session-50.scope",// root_path
> "/user.slice/user-1000.slice/session-3.scope", // cgroup_path
> 
> This very much looks like systemd managed. Given that and knowing that 
> systemd adds processes into `scopes` or `services` and groups them via 
> `slices` and also knowing that cgroups are hierarchical (i.e. limits of 
> `/foo/bar` also apply to `/foo/bar/baz`), it seems likely that if there are 
> any limits, then it'll be on `/user.slice/user-1000.slice` within the mounted 
> controller. Unfortunately, I'm not able to reproduce this myself.

I am wondering if the problem is this:

We have systemd running on the host, and a different copy of systemd that runs 
inside the container.

- They both set up `/user.slice/user-1000.slice/session-??.scope` within their 
own file systems
- For some reason, when you're looking inside the container, 
`/proc/self/cgroup` might use a path in the containerized file system whereas 
`/proc/self/mountinfo` uses a path in the host file system. These two paths may 
look alike but they have absolutely no relation to each other.

I have asked the reporter for more information:

https://gist.github.c

Re: RFR: 8286212: Cgroup v1 initialization causes NPE on some systems [v3]

2022-05-18 Thread Ioi Lam
On Wed, 18 May 2022 18:14:52 GMT, Severin Gehwolf  wrote:

>> Please review this change to the cgroup v1 subsystem which makes it more 
>> resilient on some of the stranger systems. Unfortunately, I wasn't able to 
>> re-create a similar system as the reporter. The idea of using the longest 
>> substring match for the cgroupv1 file paths was based on the fact that on 
>> systemd systems processes run in separate scopes and the maven forked test 
>> runner might exhibit this property. For that it makes sense to use the 
>> common ancestor path. Nothing changes in the common cases where the 
>> `cgroup_path` matches `_root` and when the `_root` is `/` (container the 
>> former, host system the latter).
>> 
>> In addition, the code paths which are susceptible to throw NPE have been 
>> hardened to catch those situations. Should it happen in the future it makes 
>> more sense (to me) to not have accurate container detection in favor of 
>> continuing to keep running.
>> 
>> Finally, with the added unit-tests a bug was uncovered on the "substring" 
>> match case of cgroup paths in hotspot. `p` returned from `strstr` can never 
>> point to `_root` as it's used as the "needle" to find in "haystack" 
>> `cgroup_path` (not the other way round).
>> 
>> Testing:
>> - [x] Added unit tests
>> - [x] GHA
>> - [x] Container tests on cgroups v1 Linux. Continue to pass
>
> Severin Gehwolf has updated the pull request incrementally with two 
> additional commits since the last revision:
> 
>  - Refactor hotspot gtest
>  - Separate into function. Fix comment.

Changes requested by iklam (Reviewer).

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 60:

> 58: strncpy(buf, _mount_point, MAXPATHLEN);
> 59: buf[MAXPATHLEN-1] = '\0';
> 60: _path = os::strdup(buf);

Comment on the old code: why this cannot be simply


_path = os::strdup(_mount_point);


Also, all the strncat calls in this function seem problematic, and should be 
converted to stringStream for consistency.

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 63:

> 61:   } else {
> 62: char *p = strstr(cgroup_path, _root);
> 63: if (p != NULL && p == cgroup_path) {

What happens if cgroup_path is "/foo/bar" and _root is "/fo"?

Maybe this block should be combined with the new `else` block you're adding? 
However, the behavior seems opposite between these two blocks of code:

The upper block: _root is a prefix of cgroup_path, we take the **tail** of 
cgroup_path


  TestCase substring_match = {
"/sys/fs/cgroup/memory",   // mount_path
"/user.slice/user-1000.slice", // root_path
"/user.slice/user-1000.slice/user@1001.service",   // cgroup_path
"/sys/fs/cgroup/memory/user@1001.service"  // expected_path
  };


The lower block: The beginning of _root is a prefix of cgroup_path, we take the 
**head** of cgroup_path


 TestCase prefix_matched_cg = {
"/sys/fs/cgroup/memory",   // mount_path
"/user.slice/user-1000.slice/session-50.scope",// root_path
"/user.slice/user-1000.slice/session-3.scope", // cgroup_path
"/sys/fs/cgroup/memory/user.slice/user-1000.slice" // expected_path
  };


I find the behavior hard to understand. I think the rules (and reasons) should 
be added to the comment block above the function.

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 86:

> 84:   const char* cgroup_p = cgroup_path;
> 85:   int last_slash = find_last_slash_pos(root_p, cgroup_p);
> 86:   assert(last_slash >= 0, "not an absolute path?");

Are root_p and cgroup_p directly read from the /proc/xxx files. If so, do we 
validate the input to make sure they are absolute paths?

It seems like our code cannot handle trailing '/' in the input. If so, we 
should clear all trailing '/' from the input string. Then, in functions that 
process them, we should assert that they don't end with slash. See my comment 
in find_last_slash_pos().

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 102:

> 100:   for (int i = 0; *s1 == *s2 && *s1 != 0; i++) {
> 101: if (*s1 == '/') {
> 102:   last_matching_slash_pos = i;

I found the behavior a little hard to understand. Is it intentional?


"/cat/cow", "/cat/dog"-> "/cat/"
"/cat/","/cat/dog"-> "/cat/"
"/cat", "/cat/dog"-> "/"


The name `find_last_slash_pos` doesn't properly describe the behavior. I 
thought about `find_common_path_prefix`, but that doesn't match the current 
behavior (for the 3rd case, the common path prefix is `"/cat"`).

-

PR: https://git.openjdk.java.net/jdk/pull/8629


Re: RFR: 8286212: Cgroup v1 initialization causes NPE on some systems [v2]

2022-05-16 Thread Ioi Lam
On Thu, 12 May 2022 18:09:40 GMT, Severin Gehwolf  wrote:

>> Please review this change to the cgroup v1 subsystem which makes it more 
>> resilient on some of the stranger systems. Unfortunately, I wasn't able to 
>> re-create a similar system as the reporter. The idea of using the longest 
>> substring match for the cgroupv1 file paths was based on the fact that on 
>> systemd systems processes run in separate scopes and the maven forked test 
>> runner might exhibit this property. For that it makes sense to use the 
>> common ancestor path. Nothing changes in the common cases where the 
>> `cgroup_path` matches `_root` and when the `_root` is `/` (container the 
>> former, host system the latter).
>> 
>> In addition, the code paths which are susceptible to throw NPE have been 
>> hardened to catch those situations. Should it happen in the future it makes 
>> more sense (to me) to not have accurate container detection in favor of 
>> continuing to keep running.
>> 
>> Finally, with the added unit-tests a bug was uncovered on the "substring" 
>> match case of cgroup paths in hotspot. `p` returned from `strstr` can never 
>> point to `_root` as it's used as the "needle" to find in "haystack" 
>> `cgroup_path` (not the other way round).
>> 
>> Testing:
>> - [x] Added unit tests
>> - [x] GHA
>> - [x] Container tests on cgroups v1 Linux. Continue to pass
>
> Severin Gehwolf has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Use stringStream to simplify controller path assembly

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 92:

> 90:   }
> 91:   ss.print_raw(_root, last_matching_slash_pos);
> 92:   _path = os::strdup(ss.base());

Do you mean `Find the longest common prefix`? Maybe give an example in the 
comments? Text parsing in C code is really difficult to understand.

-

PR: https://git.openjdk.java.net/jdk/pull/8629


Re: RFR: 8286212: Cgroup v1 initialization causes NPE on some systems [v2]

2022-05-16 Thread Ioi Lam
On Thu, 12 May 2022 18:09:40 GMT, Severin Gehwolf  wrote:

>> Please review this change to the cgroup v1 subsystem which makes it more 
>> resilient on some of the stranger systems. Unfortunately, I wasn't able to 
>> re-create a similar system as the reporter. The idea of using the longest 
>> substring match for the cgroupv1 file paths was based on the fact that on 
>> systemd systems processes run in separate scopes and the maven forked test 
>> runner might exhibit this property. For that it makes sense to use the 
>> common ancestor path. Nothing changes in the common cases where the 
>> `cgroup_path` matches `_root` and when the `_root` is `/` (container the 
>> former, host system the latter).
>> 
>> In addition, the code paths which are susceptible to throw NPE have been 
>> hardened to catch those situations. Should it happen in the future it makes 
>> more sense (to me) to not have accurate container detection in favor of 
>> continuing to keep running.
>> 
>> Finally, with the added unit-tests a bug was uncovered on the "substring" 
>> match case of cgroup paths in hotspot. `p` returned from `strstr` can never 
>> point to `_root` as it's used as the "needle" to find in "haystack" 
>> `cgroup_path` (not the other way round).
>> 
>> Testing:
>> - [x] Added unit tests
>> - [x] GHA
>> - [x] Container tests on cgroups v1 Linux. Continue to pass
>
> Severin Gehwolf has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Use stringStream to simplify controller path assembly

test/hotspot/gtest/runtime/test_os_linux_cgroups.cpp line 63:

> 61: ASSERT_STREQ(expected_cg_paths[i], ctrl->subsystem_path());
> 62:   }
> 63: }

I found it hard to relate the different paths. Could you create a new struct 
like this?


struct TestCase {
char* mount_path;
char* root_paths;
char* cgroup_path;
char* expected_cg_paths;
} = {
  {  "/sys/fs/cgroup/memory", // mount
   "/",   // root,
   

-

PR: https://git.openjdk.java.net/jdk/pull/8629


Re: RFR: JDK-8281001 Examine the behavior of Class::forName if the caller is null

2022-05-16 Thread Ioi Lam
On Sat, 14 May 2022 00:30:00 GMT, Tim Prinzing  wrote:

> The Class::forName behavior change to match JNI FindClass is a compatible 
> change and seems pretty attractive as it would be expected that 
> Class::forName would give the same behavior as FindClass which uses the 
> system classloader.  The test for 8281006 was enhanced to test for this 
> change.  Merged master to pick up fixes to unrelated test failures to reduce 
> noise.

Please rename the title of the issue to reflect what is being proposed.

-

PR: https://git.openjdk.java.net/jdk/pull/8711


Re: RFR: 8284950: CgroupV1 detection code should consider memory.swappiness [v18]

2022-05-15 Thread Ioi Lam
On Thu, 12 May 2022 13:46:11 GMT, xpbob  wrote:

>> set memory.swappiness to 0,swap space will not be used 
>> determine the value of memory.swappiness
>> https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
>> 
>> 
>> Memory Limit: 50.00M
>> Memory Soft Limit: Unlimited
>> Memory & Swap Limit: 100.00M
>> Maximum Processes Limit: 4194305 
>> 
>> =>
>> 
>> Memory Limit: 50.00M
>> Memory Soft Limit: Unlimited
>> Memory & Swap Limit: 50.00M
>> Maximum Processes Limit: 4194305
>
> xpbob has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   rename method

LGTM

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8285


Integrated: 8286560: Remove user parameter from jdk.internal.perf.Perf.attach()

2022-05-12 Thread Ioi Lam
On Wed, 11 May 2022 23:08:32 GMT, Ioi Lam  wrote:

> The API `jdk.internal.perf.Perf.::attach(String user, int lvmid)` is never 
> used. It should be removed, and all the handling of a specified user name 
> should be removed.

This pull request has now been integrated.

Changeset: 74eee28a
Author:    Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/74eee28a710f2d0c9f613522ee3d228d6b601252
Stats: 123 lines in 5 files changed: 2 ins; 97 del; 24 mod

8286560: Remove user parameter from jdk.internal.perf.Perf.attach()

Reviewed-by: dholmes, alanb

-

PR: https://git.openjdk.java.net/jdk/pull/8669


Re: RFR: 8286560: Remove user parameter from jdk.internal.perf.Perf.attach() [v2]

2022-05-12 Thread Ioi Lam
On Thu, 12 May 2022 04:06:44 GMT, David Holmes  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   @AlanBateman comments - remove thros IllegalArgumentException clause
>
> Nice cleanup! I checked back in JDK 7 and couldn't find any use of this 
> particular API.
> 
> Thanks.

Thanks to @dholmes-ora and @AlanBateman for the review.

-

PR: https://git.openjdk.java.net/jdk/pull/8669


Re: RFR: 8286560: Remove user parameter from jdk.internal.perf.Perf.attach() [v2]

2022-05-12 Thread Ioi Lam
On Thu, 12 May 2022 14:08:11 GMT, Alan Bateman  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   @AlanBateman comments - remove thros IllegalArgumentException clause
>
> src/java.base/share/classes/jdk/internal/perf/Perf.java line 246:
> 
>> 244:  */
>> 245: private native ByteBuffer attach0(int lvmid)
>> 246:throws IllegalArgumentException, IOException;
> 
> You can drop the "throws IllegalArgumentException" here if you want, it's not 
> needed as it's a runtime exception.

Fixed.

-

PR: https://git.openjdk.java.net/jdk/pull/8669


Re: RFR: 8286560: Remove user parameter from jdk.internal.perf.Perf.attach() [v2]

2022-05-12 Thread Ioi Lam
> The API `jdk.internal.perf.Perf.::attach(String user, int lvmid)` is never 
> used. It should be removed, and all the handling of a specified user name 
> should be removed.

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  @AlanBateman comments - remove thros IllegalArgumentException clause

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8669/files
  - new: https://git.openjdk.java.net/jdk/pull/8669/files/afa66232..0b73a9d4

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8669&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8669&range=00-01

  Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8669.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8669/head:pull/8669

PR: https://git.openjdk.java.net/jdk/pull/8669


Re: RFR: 8286212: Cgroup v1 initialization causes NPE on some systems

2022-05-11 Thread Ioi Lam
On Tue, 10 May 2022 12:29:10 GMT, Severin Gehwolf  wrote:

> Please review this change to the cgroup v1 subsystem which makes it more 
> resilient on some of the stranger systems. Unfortunately, I wasn't able to 
> re-create a similar system as the reporter. The idea of using the longest 
> substring match for the cgroupv1 file paths was based on the fact that on 
> systemd systems processes run in separate scopes and the maven forked test 
> runner might exhibit this property. For that it makes sense to use the common 
> ancestor path. Nothing changes in the common cases where the `cgroup_path` 
> matches `_root` and when the `_root` is `/` (container the former, host 
> system the latter).
> 
> In addition, the code paths which are susceptible to throw NPE have been 
> hardened to catch those situations. Should it happen in the future it makes 
> more sense (to me) to not have accurate container detection in favor of 
> continuing to keep running.
> 
> Finally, with the added unit-tests a bug was uncovered on the "substring" 
> match case of cgroup paths in hotspot. `p` returned from `strstr` can never 
> point to `_root` as it's used as the "needle" to find in "haystack" 
> `cgroup_path` (not the other way round).
> 
> Testing:
> - [x] Added unit tests
> - [x] GHA
> - [x] Container tests on cgroups v1 Linux. Continue to pass

I just started to look at the code so just one comment for now.

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java
 line 65:

> 63: path = mountPoint + cgroupSubstr;
> 64: }
> 65: } else {

Looks like `path` is still not set if the condition at line 61 `if 
(cgroupPath.length() > root.length()) {` is false.

-

PR: https://git.openjdk.java.net/jdk/pull/8629


RFR: 8286560: Remove user parameter from jdk.internal.perf.Perf.attach()

2022-05-11 Thread Ioi Lam
The API `jdk.internal.perf.Perf.::attach(String user, int lvmid)` is never 
used. It should be removed, and all the handling of a specified user name 
should be removed.

-

Commit messages:
 - more cleanup
 - 8286560: Remove user parameter from jdk.internal.perf.Perf.attach()

Changes: https://git.openjdk.java.net/jdk/pull/8669/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8669&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8286560
  Stats: 120 lines in 5 files changed: 2 ins; 95 del; 23 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8669.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8669/head:pull/8669

PR: https://git.openjdk.java.net/jdk/pull/8669


Integrated: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach()

2022-05-11 Thread Ioi Lam
On Tue, 10 May 2022 04:00:29 GMT, Ioi Lam  wrote:

> The `mode` parameter for ` jdk.internal.perf.Perf.attach()` has never 
> supported the value `"rw"` since the source code was imported to the openjdk 
> repo more than 15 years ago. In fact HotSpot throws 
> `IllegalArgumentException` when such a mode is specified.
> 
> It's unlikely such a mode will be required for future enhancements. Support 
> for `"rw"` is removed. The "mode" parameter is also removed, since now `"r"` 
> is the only supported mode.
> 
> I also cleaned up related code in the JDK and HotSpot.
> 
> Testing:
> - Passed tiers 1 ~ 5

This pull request has now been integrated.

Changeset: fcf49f42
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/fcf49f42cef4ac3e50b3b480aecf6fa38cf5be00
Stats: 205 lines in 15 files changed: 3 ins; 153 del; 49 mod

8286441: Remove mode parameter from jdk.internal.perf.Perf.attach()

Reviewed-by: redestad, alanb

-

PR: https://git.openjdk.java.net/jdk/pull/8622


Re: RFR: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach()

2022-05-11 Thread Ioi Lam
On Tue, 10 May 2022 20:03:45 GMT, Alan Bateman  wrote:

>> The `mode` parameter for ` jdk.internal.perf.Perf.attach()` has never 
>> supported the value `"rw"` since the source code was imported to the openjdk 
>> repo more than 15 years ago. In fact HotSpot throws 
>> `IllegalArgumentException` when such a mode is specified.
>> 
>> It's unlikely such a mode will be required for future enhancements. Support 
>> for `"rw"` is removed. The "mode" parameter is also removed, since now `"r"` 
>> is the only supported mode.
>> 
>> I also cleaned up related code in the JDK and HotSpot.
>> 
>> Testing:
>> - Passed tiers 1 ~ 5
>
> I skimmed through the changes and I think they look okay. In the distant past 
> there were tools outside of the JDK that used the jvmstat API directly. It's 
> possible that VisualVM still does but it would only compile/run if 
> --add-exports is used to export the sun.jvmstat.* packages. So it might be 
> that dropping the parameter from a method in RemoteHost is noticed and I 
> think that is okay because this package is not exported and is not meant to 
> be used by code outside of the JDK.

Thanks to @AlanBateman and @cl4es for the review.

-

PR: https://git.openjdk.java.net/jdk/pull/8622


Re: RFR: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach()

2022-05-10 Thread Ioi Lam
On Tue, 10 May 2022 20:03:45 GMT, Alan Bateman  wrote:

> I skimmed through the changes and I think they look okay. In the distant past 
> there were tools outside of the JDK that used the jvmstat API directly. It's 
> possible that VisualVM still does but it would only compile/run if 
> --add-exports is used to export the sun.jvmstat.* packages. So it might be 
> that dropping the parameter from a method in RemoteHost is noticed and I 
> think that is okay because this package is not exported and is not meant to 
> be used by code outside of the JDK.

The APIs changed by this PR are:

- sun.jvmstat.monitor.remote.RemoteHost::attachVm
- sun.jvmstat.monitor.VmIdentifier::getMode
- sun.jvmstat.monitor.HostIdentifier::getMode
- jdk.internal.perf.Perf::attach

I checked the latest VisualVM source code. Some of the above classes are 
referenced, but none of the affected APIs are called.

-

PR: https://git.openjdk.java.net/jdk/pull/8622


Re: RFR: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach() [v2]

2022-05-10 Thread Ioi Lam
On Tue, 10 May 2022 19:59:41 GMT, Alan Bateman  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   review comments
>
> src/jdk.internal.jvmstat/share/classes/sun/jvmstat/perfdata/monitor/protocol/file/PerfDataBuffer.java
>  line 60:
> 
>> 58: FileChannel fc = new RandomAccessFile(f, "r").getChannel();
>> 59: ByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0L, 
>> (int)fc.size());
>> 60: fc.close();   // doesn't need to remain open
> 
> I think you can change this to:
> 
> 
> try (FileChannel fc = FileChannel.open(f.toPath())) {
> ByteBuffer bb = ...
> createPerfDataBuffer(bb, 0);
> }

Fixed.

-

PR: https://git.openjdk.java.net/jdk/pull/8622


Re: RFR: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach() [v2]

2022-05-10 Thread Ioi Lam
> The `mode` parameter for ` jdk.internal.perf.Perf.attach()` has never 
> supported the value `"rw"` since the source code was imported to the openjdk 
> repo more than 15 years ago. In fact HotSpot throws 
> `IllegalArgumentException` when such a mode is specified.
> 
> It's unlikely such a mode will be required for future enhancements. Support 
> for `"rw"` is removed. The "mode" parameter is also removed, since now `"r"` 
> is the only supported mode.
> 
> I also cleaned up related code in the JDK and HotSpot.
> 
> Testing:
> - Passed tiers 1 and 2
> - Tiers 3, 4, 5 are in progress

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  review comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8622/files
  - new: https://git.openjdk.java.net/jdk/pull/8622/files/22c22c30..34a01f71

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8622&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8622&range=00-01

  Stats: 14 lines in 5 files changed: 1 ins; 7 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8622.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8622/head:pull/8622

PR: https://git.openjdk.java.net/jdk/pull/8622


Re: RFR: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach() [v2]

2022-05-10 Thread Ioi Lam
On Tue, 10 May 2022 21:43:44 GMT, Claes Redestad  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   review comments
>
> src/hotspot/os/windows/perfMemory_windows.cpp line 1781:
> 
>> 1779: // address space.
>> 1780: //
>> 1781: void PerfMemory::attach(const char* user, int vmid,
> 
> One line?

Fixed

> src/hotspot/share/prims/perf.cpp line 84:
> 
>> 82: 
>> 83:   // attach to the PerfData memory region for the specified VM
>> 84:   PerfMemory::attach(user_utf, vmid,
> 
> One line?

Fixed

> src/hotspot/share/runtime/perfMemory.hpp line 146:
> 
>> 144: // methods for attaching to and detaching from the PerfData
>> 145: // memory segment of another JVM process on the same system.
>> 146: static void attach(const char* user, int vmid,
> 
> One line?

Fixed

> src/jdk.jstatd/share/classes/sun/tools/jstatd/RemoteHostImpl.java line 74:
> 
>> 72: Integer v = lvmid;
>> 73: RemoteVm stub = null;
>> 74: StringBuilder sb = new StringBuilder();
> 
> Suggestion:
> 
> String vmidStr = "local://" + lvmid + "@localhost";

Fixed

-

PR: https://git.openjdk.java.net/jdk/pull/8622


RFR: 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach()

2022-05-09 Thread Ioi Lam
The `mode` parameter for ` jdk.internal.perf.Perf.attach()` has never supported 
the value `"rw"` since the source code was imported to the openjdk repo more 
than 15 years ago. In fact HotSpot throws `IllegalArgumentException` when such 
a mode is specified.

It's unlikely such a mode will be required for future enhancements. Support for 
`"rw"` is removed. The "mode" parameter is also removed, since now `"r"` is the 
only supported mode.

I also cleaned up related code in the JDK and HotSpot.

Testing:
- Passed tiers 1 and 2
- Tiers 3, 4, 5 are in progress

-

Commit messages:
 - 8286441: Remove mode parameter from jdk.internal.perf.Perf.attach()

Changes: https://git.openjdk.java.net/jdk/pull/8622/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8622&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8286441
  Stats: 193 lines in 15 files changed: 0 ins; 144 del; 49 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8622.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8622/head:pull/8622

PR: https://git.openjdk.java.net/jdk/pull/8622


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v3]

2022-04-29 Thread Ioi Lam
On Mon, 11 Apr 2022 14:55:32 GMT, Thomas Schatzl  wrote:

>> Hi all,
>> 
>>   can I have reviews for this change that adds dedicated filler objects to 
>> the VM?
>> 
>> Currently, when formatting areas of dead objects all gcs use instances of 
>> j.l.Object and int-arrays.
>> 
>> This has the drawback of not being easily able to discern whether a given 
>> object is actually dead (and should never be referenced) or just a regular 
>> j.l.Object/int array.
>> 
>> This also makes enhanced error detection (any reference to such an object is 
>> an error - i.e. detecting references to such objects) and to skip 
>> potentially already unloaded classes when scanning areas of the heap below 
>> TAMS, G1 uses its prev bitmap.
>> Other collectors do not have this extra information at the moment, so they 
>> can't (and don't) do this kind of verification.
>> 
>> With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the 
>> prev bitmap will effectively be removed in G1; G1 will format the dead areas 
>> with these filler objects to avoid coming across unloaded classes. This is 
>> fine wrt to normal operation, however, this looses the existing enhanced 
>> verification mentioned above.
>> 
>> This change proposes to add dedicated VM-internal filler objects, i.e. 
>> equivalents of j.l.Object and int-arrays.
>> 
>> This has the following benefits:
>> 
>> - keep this error detection (actually making it much simpler) and allowing 
>> similar verification for other collectors. (This change does not add this)
>> 
>> - this also makes it "easy" to detect references to filler objects in 
>> debugging tools - you only need to know the two klasses (or just get their 
>> friendly name) to see whether that reference may actually be valid (or 
>> refers to the inside such an object). References to these classes in the 
>> crash file may also allow the issue to be more clear.
>> 
>> This causes some minor changes to external behavior:
>> 
>> - logs/heap dumps now contain instances of these objects - which seems fine 
>> as previously they have just been reported as part of j.l.Object/int-arrays 
>> statistics. The VM spec also does not guarantee whether a particular kind of 
>> object should/should not show there anyway afaik.
>> 
>> - if the application ever gets to instantiate a reference to such an object 
>> somehow, any enabled verification will crash the VM. That's bad luck for 
>> messing with internal classes, but that's the purpose of these objects.
>> 
>> The change takes care that getting a reference will not be possible by 
>> normal means (i.e. via Class.forName() etc) - which should be sufficient to 
>> avoid the issue. Actually, existing mechanisms seem to be sufficient.
>> 
>> 
>> Testing: tier1-8
>> 
>> There is one question I would like the reviewers to specially think about, 
>> the name of the filler array klass. I just used 
>> `Ljava/internal/vm/FillerArray;` for that, looking at other internal 
>> symbols/klasses, but I'm not sure this adheres to naming guidelines.
>> 
>> Thanks go to @iklam for helping out with the change.
>> 
>> Thanks,
>>   Thomas
>
> Thomas Schatzl has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fix test

The latest version looks good to me.

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284950: Swappiness disables swap space usage

2022-04-19 Thread Ioi Lam
On Mon, 18 Apr 2022 09:07:31 GMT, xpbob  wrote:

> set memory.swappiness to 0,swap space will not be used 
> determine the value of memory.swappiness
> https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
> 
> 
> Memory Limit: 50.00M
> Memory Soft Limit: Unlimited
> Memory & Swap Limit: 100.00M
> Maximum Processes Limit: 4194305 
> 
> =>
> 
> Memory Limit: 50.00M
> Memory Soft Limit: Unlimited
> Memory & Swap Limit: 50.00M
> Maximum Processes Limit: 4194305

I changed the [JBS issue](https://bugs.openjdk.java.net/browse/JDK-8284900) 
summary to "CgroupV1 detection code should consider memory.swappiness"

-

PR: https://git.openjdk.java.net/jdk/pull/8285


Re: RFR: 8284950: Swappiness disables swap space usage

2022-04-19 Thread Ioi Lam
On Mon, 18 Apr 2022 09:07:31 GMT, xpbob  wrote:

> set memory.swappiness to 0,swap space will not be used 
> determine the value of memory.swappiness
> https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
> 
> 
> Memory Limit: 50.00M
> Memory Soft Limit: Unlimited
> Memory & Swap Limit: 100.00M
> Maximum Processes Limit: 4194305 
> 
> =>
> 
> Memory Limit: 50.00M
> Memory Soft Limit: Unlimited
> Memory & Swap Limit: 50.00M
> Maximum Processes Limit: 4194305

Both the PR and the bug report are not clear about exactly what the problem is. 

Could you provide a test case that demonstrates the relationship between 
`memory.memsw.limit_in_bytes` and `memory.swappiness`? It will be best if you 
can upload a shell script into the bug report so we know the exact steps to 
reproduce the problem.

Ultimately we should add a new jtreg test case.


# scenario 1
memory.memsw.limit_in_bytes = 1000M
memory.limit_in_bytes = 50M
memory.swappiness = 1

-> "java -Xms60m -XX:+AlwaysPreTouch -version" can successfully complete

# scenario 2
memory.memsw.limit_in_bytes = 1000M
memory.limit_in_bytes = 50M
memory.swappiness = 0

-> "java -Xms60m -XX:+AlwaysPreTouch -version" will be killed by cgroups


Related to your other PR (https://github.com/openjdk/jdk/pull/8256), I think 
`CgroupV1Subsystem::memory_and_swap_limit_in_bytes()` also need to be changed 
so that it will return 50M in scenario 2.

-

PR: https://git.openjdk.java.net/jdk/pull/8285


Re: RFR: 8284950: Swappiness disables swap space usage

2022-04-18 Thread Ioi Lam
On Mon, 18 Apr 2022 16:22:27 GMT, Ioi Lam  wrote:

>> set memory.swappiness to 0,swap space will not be used 
>> determine the value of memory.swappiness
>> https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
>> 
>> 
>> Memory Limit: 50.00M
>> Memory Soft Limit: Unlimited
>> Memory & Swap Limit: 100.00M
>> Maximum Processes Limit: 4194305 
>> 
>> =>
>> 
>> Memory Limit: 50.00M
>> Memory Soft Limit: Unlimited
>> Memory & Swap Limit: 50.00M
>> Maximum Processes Limit: 4194305
>
> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java
>  line 155:
> 
>> 153: long memswBytes = getLongValue(controller, 
>> "memory.memsw.limit_in_bytes");
>> 154: long swappiness = getLongValue(controller, "memory.swappiness");
>> 155: return (memswBytes > 0 && swappiness > 0);
> 
> Does this also need to be changed in the test?
> 
> https://github.com/openjdk/jdk/blob/c63fabe3d582ce0828b04b0224cea49aab5fedf3/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java#L291-L296

There's also corresponding code in HotSpot:

https://github.com/openjdk/jdk/blob/c63fabe3d582ce0828b04b0224cea49aab5fedf3/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L129-L150

-

PR: https://git.openjdk.java.net/jdk/pull/8285


Re: RFR: 8284950: Swappiness disables swap space usage

2022-04-18 Thread Ioi Lam
On Mon, 18 Apr 2022 09:07:31 GMT, xpbob  wrote:

> set memory.swappiness to 0,swap space will not be used 
> determine the value of memory.swappiness
> https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
> 
> 
> Memory Limit: 50.00M
> Memory Soft Limit: Unlimited
> Memory & Swap Limit: 100.00M
> Maximum Processes Limit: 4194305 
> 
> =>
> 
> Memory Limit: 50.00M
> Memory Soft Limit: Unlimited
> Memory & Swap Limit: 50.00M
> Maximum Processes Limit: 4194305

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java
 line 155:

> 153: long memswBytes = getLongValue(controller, 
> "memory.memsw.limit_in_bytes");
> 154: long swappiness = getLongValue(controller, "memory.swappiness");
> 155: return (memswBytes > 0 && swappiness > 0);

Does this also need to be changed in the test?

https://github.com/openjdk/jdk/blob/c63fabe3d582ce0828b04b0224cea49aab5fedf3/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java#L291-L296

-

PR: https://git.openjdk.java.net/jdk/pull/8285


Integrated: 8284336: CDS SignedJar.java test fails due to archived Reference object

2022-04-08 Thread Ioi Lam
On Thu, 7 Apr 2022 22:44:19 GMT, Ioi Lam  wrote:

> During `java -Xshare:dump`, `ClassLoaders.bootLoader().resourceCache` is 
> usually null. However, if a signed class is loaded, `resourceCache` will 
> point to a `java.lang.ref.SoftReference`. Although rare (we have never seen 
> this during our testing), it's possible for `resourceCache.discovered` to 
> directly or indirectly point to another `Reference` which may contain an 
> object that cannot be archived.
> 
> The fix is simple: reset the `resourceCache` field of all three archived 
> ClassLoader objects (boot/platform/app).
> 
> I cannot reproduce the problem and I am unable to write a deterministic test 
> case. However, the bug reporter has tested my preliminary patch and is no 
> longer able to reproduce the failure.
> 
> Please see the bug report 
> [JDK-8284336](https://bugs.openjdk.java.net/browse/JDK-8284336) for detailed 
> analysis and traces.

This pull request has now been integrated.

Changeset: 662320a0
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/662320a0ec0b373fa1e4df9281224e9bdbdf76ac
Stats: 28 lines in 5 files changed: 25 ins; 0 del; 3 mod

8284336: CDS SignedJar.java test fails due to archived Reference object

Reviewed-by: alanb, ccheung

-

PR: https://git.openjdk.java.net/jdk/pull/8151


Re: RFR: 8284336: CDS SignedJar.java test fails due to archived Reference object

2022-04-08 Thread Ioi Lam
On Fri, 8 Apr 2022 07:08:29 GMT, Alan Bateman  wrote:

>> During `java -Xshare:dump`, `ClassLoaders.bootLoader().resourceCache` is 
>> usually null. However, if a signed class is loaded, `resourceCache` will 
>> point to a `java.lang.ref.SoftReference`. Although rare (we have never seen 
>> this during our testing), it's possible for `resourceCache.discovered` to 
>> directly or indirectly point to another `Reference` which may contain an 
>> object that cannot be archived.
>> 
>> The fix is simple: reset the `resourceCache` field of all three archived 
>> ClassLoader objects (boot/platform/app).
>> 
>> I cannot reproduce the problem and I am unable to write a deterministic test 
>> case. However, the bug reporter has tested my preliminary patch and is no 
>> longer able to reproduce the failure.
>> 
>> Please see the bug report 
>> [JDK-8284336](https://bugs.openjdk.java.net/browse/JDK-8284336) for detailed 
>> analysis and traces.
>
> The updates to resetArchivedStates look okay.

Thanks @AlanBateman and @calvinccheung for the review!

-

PR: https://git.openjdk.java.net/jdk/pull/8151


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas

2022-04-08 Thread Ioi Lam
On Fri, 8 Apr 2022 08:13:33 GMT, Thomas Schatzl  wrote:

> Hi all,
> 
>   can I have reviews for this change that adds dedicated filler objects to 
> the VM?
> 
> Currently, when formatting areas of dead objects all gcs use instances of 
> j.l.Object and int-arrays.
> 
> This has the drawback of not being easily able to discern whether a given 
> object is actually dead (and should never be referenced) or just a regular 
> j.l.Object/int array.
> 
> This also makes enhanced error detection (any reference to such an object is 
> an error - i.e. detecting references to such objects) and to skip potentially 
> already unloaded classes when scanning areas of the heap below TAMS, G1 uses 
> its prev bitmap.
> Other collectors do not have this extra information at the moment, so they 
> can't (and don't) do this kind of verification.
> 
> With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the prev 
> bitmap will effectively be removed in G1; G1 will format the dead areas with 
> these filler objects to avoid coming across unloaded classes. This is fine 
> wrt to normal operation, however, this looses the existing enhanced 
> verification mentioned above.
> 
> This change proposes to add dedicated VM-internal filler objects, i.e. 
> equivalents of j.l.Object and int-arrays.
> 
> This has the following benefits:
> 
> - keep this error detection (actually making it much simpler) and allowing 
> similar verification for other collectors. (This change does not add this)
> 
> - this also makes it "easy" to detect references to filler objects in 
> debugging tools - you only need to know the two klasses (or just get their 
> friendly name) to see whether that reference may actually be valid (or refers 
> to the inside such an object). References to these classes in the crash file 
> may also allow the issue to be more clear.
> 
> This causes some minor changes to external behavior:
> 
> - logs/heap dumps now contain instances of these objects - which seems fine 
> as previously they have just been reported as part of j.l.Object/int-arrays 
> statistics. The VM spec also does not guarantee whether a particular kind of 
> object should/should not show there anyway afaik.
> 
> - if the application ever gets to instantiate a reference to such an object 
> somehow, any enabled verification will crash the VM. That's bad luck for 
> messing with internal classes, but that's the purpose of these objects.
> 
> The change takes care that getting a reference will not be possible by normal 
> means (i.e. via Class.forName() etc) - which should be sufficient to avoid 
> the issue. Actually, existing mechanisms seem to be sufficient.
> 
> 
> Testing: tier1-8
> 
> There is one question I would like the reviewers to specially think about, 
> the name of the filler array klass. I just used 
> `Ljava/internal/vm/FillerArray;` for that, looking at other internal 
> symbols/klasses, but I'm not sure this adheres to naming guidelines.
> 
> Thanks go to @iklam for helping out with the change.
> 
> Thanks,
>   Thomas

Changes requested by iklam (Reviewer).

src/hotspot/share/classfile/systemDictionaryShared.cpp line 1727:

> 1725: ArchivedMirrorPatcher::update_array_klasses(k);
> 1726:   }
> 1727:   
> ArchivedMirrorPatcher::update_array_klasses(Universe::fillerArrayKlassObj());

I think this is not necessary. `Universe::fillerArrayKlassObj()` shares the 
same mirror as `Universe::intArrayKlassObj()`, which has already been updated 
in the loop above.

`ArchivedMirrorPatcher::update_array_klasses(k)` will essentially do 
`k->mirror->pointer_back_to_klass += delta`, so it will incorrectly set the 
pointer when delta is not zero.

I would suggest running with


-XX:ArchiveRelocationMode=1 -Xlog:cds -Xlog:class+load=debug


and step into the following code: 


void java_lang_Class::update_archived_mirror_native_pointers(oop 
archived_mirror) {
  assert(MetaspaceShared::relocation_delta() != 0, "must be");

  Klass* k = ((Klass*)archived_mirror->metadata_field(_klass_offset));
  archived_mirror->metadata_field_put(_klass_offset,
  (Klass*)(address(k) + MetaspaceShared::relocation_delta()));  HERE

src/hotspot/share/memory/universe.cpp line 205:

> 203:   }
> 204:   // Hide _fillerArrayKlassObj from JVMTI
> 205:   // closure->do_klass(_fillerArrayKlassObj);

Maybe the comment should be more explicit?

// We don't do the following because it will confuse JVMTI.
// _fillerArrayKlassObj is used only by GC, which doesn't need to see 
// this klass from basic_type_classes_do().

-

PR: https://git.openjdk.java.net/jdk/pull/8156


RFR: 8284336: CDS SignedJar.java test fails due to archived Reference object

2022-04-07 Thread Ioi Lam
During `java -Xshare:dump`, `ClassLoaders.bootLoader().resourceCache` is 
usually null. However, if a signed class is loaded, `resourceCache` will point 
to a `java.lang.ref.SoftReference`. Although rare (we have never seen this 
during our testing), it's possible for `resourceCache.discovered` to directly 
or indirectly point to another `Reference` which may contain an object that 
cannot be archived.

The fix is simple: reset the `resourceCache` field of all three archived 
ClassLoader objects (boot/platform/app).

I cannot reproduce the problem and I am unable to write a deterministic test 
case. However, the bug reporter has tested my preliminary patch and is no 
longer able to reproduce the failure.

Please see the bug report 
[JDK-8284336](https://bugs.openjdk.java.net/browse/JDK-8284336) for detailed 
analysis and traces.

-

Commit messages:
 - 8284336: CDS SignedJar.java test fails due to archived Reference object

Changes: https://git.openjdk.java.net/jdk/pull/8151/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8151&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8284336
  Stats: 28 lines in 5 files changed: 25 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8151.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8151/head:pull/8151

PR: https://git.openjdk.java.net/jdk/pull/8151


Integrated: 8253495: CDS generates non-deterministic output

2022-03-15 Thread Ioi Lam
On Tue, 8 Mar 2022 19:11:02 GMT, Ioi Lam  wrote:

> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

This pull request has now been integrated.

Changeset: de4f04cb
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/de4f04cb71a26ce03b96460cb8d1c1e28cd1ed38
Stats: 100 lines in 15 files changed: 69 ins; 9 del; 22 mod

8253495: CDS generates non-deterministic output

Reviewed-by: erikj, kbarrett, ccheung, ihse

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v8]

2022-03-15 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  Avoid memset twice in os::malloc(); added comments about 
NMTPreInit::handle_malloc vs DumpSharedSpaces

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/cd934f3c..f202bcbf

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=07
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=06-07

  Stats: 7 lines in 1 file changed: 5 ins; 2 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v6]

2022-03-15 Thread Ioi Lam
On Mon, 14 Mar 2022 22:07:24 GMT, Calvin Cheung  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Added helper function CollectedHeap::zap_filler_array_with
>
> test/hotspot/jtreg/runtime/cds/appcds/javaldr/LockDuringDumpAgent.java line 
> 65:
> 
>> 63: if (elapsed >= timeout) {
>> 64: System.out.println("This JVM may decide to not 
>> launch any Java threads during -Xshare:dump.");
>> 65: System.out.println("This is OK because no string 
>> objects be in a locked state during heap dump.");
> 
> Should `no string objects be` be `no string objects could be`?

Thanks for the review. I've fixed the comment as you suggested.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v7]

2022-03-15 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request with a new target base due to a merge or a 
rebase. The incremental webrev excludes the unrelated changes brought in by the 
merge/rebase. The pull request contains 12 additional commits since the last 
revision:

 - fixed copyright
 - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2
 - @calvinccheung review: fixed typo
 - Added helper function CollectedHeap::zap_filler_array_with
 - @kimbarrett comments
 - zero GC heap filler arrays
 - improvement zeroing of alignment gaps
 - Fixed zero build
 - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2
 - fixed test
 - ... and 2 more: https://git.openjdk.java.net/jdk/compare/a6344ded...cd934f3c

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/47e0238a..cd934f3c

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=06
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=05-06

  Stats: 6804 lines in 210 files changed: 3347 ins; 1802 del; 1655 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-11 Thread Ioi Lam
On Fri, 11 Mar 2022 07:13:35 GMT, Thomas Stuefe  wrote:

> Is reproducibility also a topic for users calling -Xdump with custom JNI 
> coding? Or maybe having the VM instrumented somehow? Since it seems such an 
> easy fix, I would prevent attaching too. At least the user would get a clear 
> error message.

It's impossible to execute arbitrary Java code when running "java 
-Xshare:dump", so this means there's no way to load a JNI library when creating 
a *static* CDS archive. The loading of JVMTI agents is also not supported. So 
this is not a case we need to handle. 

During *dynamic* CDS dumps, arbitrary Java code can execute, but we don't have 
a requirement for the *dynamic* CDS archive to be deterministic (at least not 
for now).

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v6]

2022-03-10 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  Added helper function CollectedHeap::zap_filler_array_with

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/584c6572..47e0238a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=04-05

  Stats: 10 lines in 2 files changed: 6 ins; 2 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output

2022-03-10 Thread Ioi Lam
On Fri, 11 Mar 2022 05:59:00 GMT, David Holmes  wrote:

> > I ended up changing `os::malloc()` to zero the buffer when running with 
> > -Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't 
> > matter too much for regular VM executions because `os::malloc()` already 
> > has a high overhead.
> 
> This is raising red flags for me sorry. Every user of the JDK is now paying a 
> penalty because of something only needed when dumping the shared archive. It 
> might not be much but it is the old "death by a thousand cuts". Is there any 
> way to tell the OS to pre-zero all memory provided to the current process, 
> such that we could set that when dumping and not have to check on each 
> allocation?

I don't know how to tell the OS (or C library) to zero out the buffer returned 
by malloc. However, in the current code path, we already have a test for an 
uncommon condition when `os::malloc()` calls `MemTracker::record_malloc()` 
which calls `MallocTracker::record_malloc()`


void* MallocTracker::record_malloc(void* malloc_base, size_t size, MEMFLAGS 
flags,
  const NativeCallStack& stack)
{
  if (MemTracker::tracking_level() == NMT_detail) {
MallocSiteTable::allocation_at(stack, size, &mst_marker, flags);
  }


I can combine the tests for `MemTracker::tracking_level()` and 
`DumpSharedSpaces` into a single test and do more work only when the uncommon 
path is taken. This would require some refactoring of the 
MemTracker/MallocTracker code. I'd rather do that in a separate RFE.

In fact, `MemTracker::_tracking_level` is tested twice in the current 
implementation. We can change it to do a single test in the most common case 
(NMT_summary) if we really want to cut down the number of tests. But honestly I 
don't think this makes any difference.

> And I have to wonder how easy it would be to re-introduce non-deterministic 
> values in these data structures that are being dumped. Does malloc itself 
> even guarantee to return the same set of addresses for the same sequence of 
> requests in different executions of a program?

The malloc'ed objects are copied into the CDS archive at deterministic 
addresses. Any pointers inside such objects will be relocated.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v5]

2022-03-10 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  @kimbarrett comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/be7673af..584c6572

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=04
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=03-04

  Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v4]

2022-03-10 Thread Ioi Lam
On Fri, 11 Mar 2022 05:55:20 GMT, Kim Barrett  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   zero GC heap filler arrays
>
> src/hotspot/share/gc/shared/collectedHeap.cpp line 449:
> 
>> 447:   allocator.initialize(start);
>> 448:   DEBUG_ONLY(zap_filler_array(start, words, zap);)
>> 449:   if (DumpSharedSpaces) {
> 
> Probably shouldn't both zap and clear for dumping, to avoid wasting time.

Fixed.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v4]

2022-03-10 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  zero GC heap filler arrays

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/6974021f..be7673af

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=02-03

  Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-10 Thread Ioi Lam
On Wed, 9 Mar 2022 07:47:19 GMT, Thomas Stuefe  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Fixed zero build
>
> src/hotspot/share/utilities/hashtable.hpp line 42:
> 
>> 40: 
>> 41:   LP64_ONLY(unsigned int _gap;)
>> 42: 
> 
> For 64-bit, you now lose packing potential in the theoretical case the 
> following payload does not have to be aligned to 64 bit. E.g. for T=char, 
> where the whole entry would fit into 8 bytes. Probably does not matter as 
> long as entries are allocated individually from C-heap which is a lot more 
> wasteful anyway. 
> 
> For 32-bit, I think you may have the same problem if the payload starts with 
> a uint64_t. Would that not be aligned to a 64-bit boundary too? Whether or 
> not you build on 64-bit?
> 
> I think setting the memory, or at least the first 8..16 bytes, of the entry 
> to zero in BasicHashtable::new_entry could be more robust:
> 
> (16 bytes in case the payload starts with a long double but that may be 
> overthinking it :)
> 
> 
> template  BasicHashtableEntry* 
> BasicHashtable::new_entry(unsigned int hashValue) {
>   char* p = :new (NEW_C_HEAP_ARRAY(char, this->entry_size(), F);
>   ::memset(p, 0, MIN2(this->entry_size(), 16)); // needs reproducable
>   BasicHashtableEntry* entry = ::new (p) BasicHashtableEntry(hashValue);
>   return entry;
> }
> 
> If you are worried about performance, this may also be controlled by a 
> template parameter, and then you do it just for the system dictionary.

Thanks for pointing this out. I ran more tests and found that on certain 
platforms, there are other structures that have problems with uninitialized 
gaps. I ended up changing `os::malloc()` to zero the buffer when running with 
-Xshare:dump. Hopefully one extra check of `if (DumpSharedSpaces)` doesn't 
matter too much for regular VM executions because `os::malloc()` already has a 
high overhead.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v3]

2022-03-10 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  improvement zeroing of alignment gaps

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/44db40f1..6974021f

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=01-02

  Stats: 8 lines in 2 files changed: 3 ins; 2 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-10 Thread Ioi Lam
On Thu, 10 Mar 2022 13:51:56 GMT, Magnus Ihse Bursie  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Fixed zero build
>
> I think he already did. I'm quoting:
> 
>> However, the CDS archive also contains a heap dump, which includes Java 
>> HashMaps. If I allow those 3 Java threads to start, some HashMaps in the 
>> module graph will have unstable ordering. I think the reason is concurrent 
>> thread execution causes unstable assignment of the identity_hash for objects 
>> in the heap dump.

> @magicus the issue is not the list of classes dumped, or their format in the 
> dump. As Ioi indicated that list is fixed. The issue is with the heap dump 
> part of the archive. Running these other threads affects the heap so by not 
> running them with end up with a different heap. So the question is whether 
> there is anything about having a different heap dumped that we need to be 
> concerned about. We dump the heap into the archive for a reason and this 
> changes what we dump,

To be clear, if multiple threads are running, classes could be loaded in a 
different order and the symbols will have different orders. This would cause 
the vtables in Klass objects to be laid out differently. Trying to fix this is 
very difficult. I have a different patch that makes it work but it's just too 
complicated.

So, disabling the threads is also necessary for (easily) sorting the metaspace 
objects.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-10 Thread Ioi Lam
On Wed, 9 Mar 2022 07:51:46 GMT, Thomas Stuefe  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Fixed zero build
>
> src/hotspot/share/prims/jvm.cpp line 2887:
> 
>> 2885: return;
>> 2886:   }
>> 2887: #endif
> 
> Should we do this for jni_AttachCurrentThread too?

This hasn't been necessary for me because jni_AttachCurrentThread is not called 
during "java -Xshare:dump", which executes under a very strict condition and 
doesn't normally allow arbitrary JNI libraries to be loaded.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-10 Thread Ioi Lam
On Thu, 10 Mar 2022 13:51:56 GMT, Magnus Ihse Bursie  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Fixed zero build
>
> I think he already did. I'm quoting:
> 
>> However, the CDS archive also contains a heap dump, which includes Java 
>> HashMaps. If I allow those 3 Java threads to start, some HashMaps in the 
>> module graph will have unstable ordering. I think the reason is concurrent 
>> thread execution causes unstable assignment of the identity_hash for objects 
>> in the heap dump.

> @magicus I think we need @iklam to weigh in here and explain exactly what the 
> "heap dump" consists of and how not running those threads affects its 
> contents. Presently the heap dump is potentially different on each run, IIUC, 
> only due to the order of its contents, not the contents themselves.

CDS doesn't dump the entire Java heap. Instead, it dumps only a selected 
portion of the Java heap. For example, the module graph. The contents of the 
dumped objects are always the same, except that the identity hashcode may be 
different if multiple threads are executed. The identity hashcode is computed 
here, and its value is "sticky" to the first thread that tries to get the 
hashcode for an object.


static inline intptr_t get_next_hash(Thread* current, oop obj) {
  intptr_t value = 0;
  if (hashCode == 0) {
 ...
  } else if (hashCode == 4) {
 ...
  } else { // default hashCode is 5:
// Marsaglia's xor-shift scheme with thread-specific state
// This is probably the best overall implementation -- we'll
// likely make this the default in future releases.
unsigned t = current->_hashStateX;
t ^= (t << 11);
current->_hashStateX = current->_hashStateY;
current->_hashStateY = current->_hashStateZ;
current->_hashStateZ = current->_hashStateW;
unsigned v = current->_hashStateW;
v = (v ^ (v >> 19)) ^ (t ^ (t >> 8));
current->_hashStateW = v;
value = v;
  }


So, when the main Java thread tries to store an object `O` into a hashtable 
inside the module graph, if the hashcode of `O` has already been computed by a 
non-main thread, then the module graph will have unstable contents.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-08 Thread Ioi Lam
On Wed, 9 Mar 2022 07:04:56 GMT, David Holmes  wrote:

> I have reservations about contorting things this way just to get 
> "deterministic output".
> 
> The VM needs to fully initialize and then become quiescent before the dump 
> occurs, and as I say below if you don't start other threads then you 
> potentially remove part of the archive because classes won't be loaded by 
> those threads.
> 
> I think if you care about the order of dumping classes then you should 
> control that order, you don't try to force the order of loading. Can't you 
> sort things before dumping? ie rehash/rebuild the hashtables etc so it has a 
> canonical ordering? I see this was mentioned in the bug report and is 
> considered a largish/complex fix, but it would be the proper fix IMO.
> 
> Thanks, David

I tried the "proper" approach but it's very complicated. I already have an 
implementation that sorts all the metadata. However, the CDS archive also 
contains a heap dump, which includes Java HashMaps. If I allow those 3 Java 
threads to start, some HashMaps in the module graph will have unstable 
ordering. I think the reason is concurrent thread execution causes unstable 
assignment of the identity_hash for objects in the heap dump. I cannot think of 
a clean way to fix this.

The alternative, disabling Java thread starts, is much simpler and much more 
appealing to me.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-08 Thread Ioi Lam
On Wed, 9 Mar 2022 06:49:02 GMT, David Holmes  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Fixed zero build
>
> src/hotspot/share/prims/jvm.cpp line 2873:
> 
>> 2871: // execute in parallel, symbols and classes may be loaded in
>> 2872: // random orders which will make the resulting CDS archive
>> 2873: // non-deterministic.
> 
> Yes but by not starting these threads you are potentially excluding a range 
> of classes from the shared archive!

`java -Xshare:dump` loads all classes specified in a classlist, which is 
created without this thread-disabling hack.

The number of classes in the CDS archive is the same before/after this PR. The 
size of the CDS archive is identical.

-

PR: https://git.openjdk.java.net/jdk/pull/7748


Re: RFR: 8253495: CDS generates non-deterministic output [v2]

2022-03-08 Thread Ioi Lam
> This patch makes the result of "java -Xshare:dump" deterministic:
> - Disabled new Java threads from launching. This is harmless. See comments in 
> jvm.cpp
> - Fixed a problem in hashtable ordering in heapShared.cpp
> - BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
> bits. Added code to zero it.
> - Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
> make/scripts/compare.sh
> 
> Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
> This will be fixed in 
> [JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).
> 
> Testing under way:
> - tier1~tier5
> - Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
> windows-x86-cmp-baseline,  etc).

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  Fixed zero build

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7748/files
  - new: https://git.openjdk.java.net/jdk/pull/7748/files/1fb3f830..44db40f1

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=00-01

  Stats: 5 lines in 2 files changed: 3 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


RFR: 8253495: CDS generates non-deterministic output

2022-03-08 Thread Ioi Lam
This patch makes the result of "java -Xshare:dump" deterministic:
- Disabled new Java threads from launching. This is harmless. See comments in 
jvm.cpp
- Fixed a problem in hashtable ordering in heapShared.cpp
- BasicHashtableEntry has a gap on 64-bit platforms that may contain random 
bits. Added code to zero it.
- Enabled checking of $JAVA_HOME/lib/server/classes.jsa in 
make/scripts/compare.sh

Note:  $JAVA_HOME/lib/server/classes_ncoops.jsa is still non-deterministic. 
This will be fixed in 
[JDK-8282828](https://bugs.openjdk.java.net/browse/JDK-8282828).

Testing under way:
- tier1~tier5
- Run all *-cmp-baseline jobs 20 times each (linux-aarch64-cmp-baseline, 
windows-x86-cmp-baseline,  etc).

-

Commit messages:
 - Merge branch 'master' into 8253495-cds-generateds-non-deterministic-output-2
 - fixed test
 - more fixes
 - 8253495: CDS generates non-deterministic output

Changes: https://git.openjdk.java.net/jdk/pull/7748/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7748&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8253495
  Stats: 73 lines in 12 files changed: 49 ins; 7 del; 17 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7748.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7748/head:pull/7748

PR: https://git.openjdk.java.net/jdk/pull/7748


Integrated: 8275731: CDS archived enums objects are recreated at runtime

2022-02-28 Thread Ioi Lam
On Wed, 1 Dec 2021 20:47:20 GMT, Ioi Lam  wrote:

> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in 
> an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
> public static final SUNDAY = new Day("SUNDAY");
> public static final MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::` is executed twice: once 
> during `java -Xshare:dump`, and once during normal JVM execution. If the 
> archived heap objects references one of the Enum constants created at dump 
> time, we will violate the uniqueness requirements of the Enum constants at 
> runtime. See the test case in the description of 
> [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discovered that an Enum constant of type X is 
> archived, we archive all constants of type X. At Runtime, type X will skip 
> the normal execution of `X::`. Instead, we run 
> `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
> were saved at dump time.
> 
> This is safe as we know that `X::` has no observable side effect -- 
> it only creates the constants of type X, as well as the synthetic value 
> `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
> similar problems where the archived heap objects reference a static field 
> that may be recreated at runtime. There are some manual steps involved, but I 
> analyzed the potential problems found by the tool are they are all safe 
> (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. 
> An example trace of this tool can be found at 
> https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

This pull request has now been integrated.

Changeset: d983d108
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/d983d108c565654e717e2811d88aa94d982da2f5
Stats: 860 lines in 16 files changed: 807 ins; 4 del; 49 mod

8275731: CDS archived enums objects are recreated at runtime

Reviewed-by: coleenp, ccheung

-

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v4]

2022-02-28 Thread Ioi Lam
On Thu, 17 Feb 2022 23:20:41 GMT, Calvin Cheung  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Use InstanceKlass::do_local_static_fields for some field iterations
>
> Looks good. Minor comment below.
> Also, several files with copyright year 2021 need updating.

Thanks @calvinccheung and @coleenp for the review. Passed tiers 1-5.

-

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v7]

2022-02-27 Thread Ioi Lam
> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in 
> an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
> public static final SUNDAY = new Day("SUNDAY");
> public static final MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::` is executed twice: once 
> during `java -Xshare:dump`, and once during normal JVM execution. If the 
> archived heap objects references one of the Enum constants created at dump 
> time, we will violate the uniqueness requirements of the Enum constants at 
> runtime. See the test case in the description of 
> [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discovered that an Enum constant of type X is 
> archived, we archive all constants of type X. At Runtime, type X will skip 
> the normal execution of `X::`. Instead, we run 
> `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
> were saved at dump time.
> 
> This is safe as we know that `X::` has no observable side effect -- 
> it only creates the constants of type X, as well as the synthetic value 
> `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
> similar problems where the archived heap objects reference a static field 
> that may be recreated at runtime. There are some manual steps involved, but I 
> analyzed the potential problems found by the tool are they are all safe 
> (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. 
> An example trace of this tool can be found at 
> https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

Ioi Lam has updated the pull request with a new target base due to a merge or a 
rebase. The pull request now contains 10 commits:

 - fixed copyright year
 - Merge branch 'master' into 8275731-heapshared-enum
 - fixed whitespace
 - Fixed comments per @calvinccheung review
 - Merge branch 'master' into 8275731-heapshared-enum
 - Use InstanceKlass::do_local_static_fields for some field iterations
 - Merge branch 'master' into 8275731-heapshared-enum
 - added exclusions needed by "java -Xshare:dump -ea -esa"
 - Comments from @calvinccheung off-line
 - 8275731: CDS archived enums objects are recreated at runtime

-

Changes: https://git.openjdk.java.net/jdk/pull/6653/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=06
  Stats: 860 lines in 16 files changed: 807 ins; 4 del; 49 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v6]

2022-02-22 Thread Ioi Lam
> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in 
> an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
> public static final SUNDAY = new Day("SUNDAY");
> public static final MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::` is executed twice: once 
> during `java -Xshare:dump`, and once during normal JVM execution. If the 
> archived heap objects references one of the Enum constants created at dump 
> time, we will violate the uniqueness requirements of the Enum constants at 
> runtime. See the test case in the description of 
> [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discovered that an Enum constant of type X is 
> archived, we archive all constants of type X. At Runtime, type X will skip 
> the normal execution of `X::`. Instead, we run 
> `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
> were saved at dump time.
> 
> This is safe as we know that `X::` has no observable side effect -- 
> it only creates the constants of type X, as well as the synthetic value 
> `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
> similar problems where the archived heap objects reference a static field 
> that may be recreated at runtime. There are some manual steps involved, but I 
> analyzed the potential problems found by the tool are they are all safe 
> (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. 
> An example trace of this tool can be found at 
> https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  fixed whitespace

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6653/files
  - new: https://git.openjdk.java.net/jdk/pull/6653/files/4764075e..c6e9be1d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=05
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=04-05

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v5]

2022-02-22 Thread Ioi Lam
> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in 
> an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
> public static final SUNDAY = new Day("SUNDAY");
> public static final MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::` is executed twice: once 
> during `java -Xshare:dump`, and once during normal JVM execution. If the 
> archived heap objects references one of the Enum constants created at dump 
> time, we will violate the uniqueness requirements of the Enum constants at 
> runtime. See the test case in the description of 
> [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discovered that an Enum constant of type X is 
> archived, we archive all constants of type X. At Runtime, type X will skip 
> the normal execution of `X::`. Instead, we run 
> `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
> were saved at dump time.
> 
> This is safe as we know that `X::` has no observable side effect -- 
> it only creates the constants of type X, as well as the synthetic value 
> `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
> similar problems where the archived heap objects reference a static field 
> that may be recreated at runtime. There are some manual steps involved, but I 
> analyzed the potential problems found by the tool are they are all safe 
> (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. 
> An example trace of this tool can be found at 
> https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

Ioi Lam has updated the pull request with a new target base due to a merge or a 
rebase. The pull request now contains seven commits:

 - Fixed comments per @calvinccheung review
 - Merge branch 'master' into 8275731-heapshared-enum
 - Use InstanceKlass::do_local_static_fields for some field iterations
 - Merge branch 'master' into 8275731-heapshared-enum
 - added exclusions needed by "java -Xshare:dump -ea -esa"
 - Comments from @calvinccheung off-line
 - 8275731: CDS archived enums objects are recreated at runtime

-

Changes: https://git.openjdk.java.net/jdk/pull/6653/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=04
  Stats: 850 lines in 16 files changed: 807 ins; 2 del; 41 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3]

2022-02-16 Thread Ioi Lam
On Wed, 19 Jan 2022 05:50:50 GMT, Ioi Lam  wrote:

>> I don't really know this code well enough to do a good code review.  I had 
>> some comments though.
>
>> I don't really know this code well enough to do a good code review. I had 
>> some comments though.
> 
> Hi Coleen, thanks for taking a look.
> 
> This PR has two major parts:
> 
> 1. Check for inappropriate reference to static fields. This is mainly done in 
> cdsHeapVerifier.cpp. These checks don't affect the contents of the CDS 
> archive. They just print out warnings if problems are found.
> 2. Special initialization of enum classes. Essentially if any instance of an 
> enum class `X` is archived, then `X::` will not be executed, and 
> we'll take this path instead (in instanceKlass.cpp):
> 
> 
>   // This is needed to ensure the consistency of the archived heap objects.
>   if (has_archived_enum_objs()) {
> assert(is_shared(), "must be");
> bool initialized = HeapShared::initialize_enum_klass(this, CHECK);
> if (initialized) {
>   return;
> }
>   }
> 
> Could you check if (2) is correct?

> @iklam This pull request has been inactive for more than 4 weeks and will be 
> automatically closed if another 4 weeks passes without any activity. To avoid 
> this, simply add a new comment to the pull request. Feel free to ask for 
> assistance if you need help with progressing this pull request towards 
> integration!

keepalive

-

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3]

2022-01-18 Thread Ioi Lam
On Mon, 17 Jan 2022 19:22:23 GMT, Coleen Phillimore  wrote:

> I don't really know this code well enough to do a good code review. I had 
> some comments though.

Hi Coleen, thanks for taking a look.

This PR has two major parts:

1. Check for inappropriate reference to static fields. This is mainly done in 
cdsHeapVerifier.cpp. These checks don't affect the contents of the CDS archive. 
They just print out warnings if problems are found.
2. Special initialization of enum classes. Essentially if any instance of an 
enum class `X` is archived, then `X::` will not be executed, and we'll 
take this path instead (in instanceKlass.cpp):


  // This is needed to ensure the consistency of the archived heap objects.
  if (has_archived_enum_objs()) {
assert(is_shared(), "must be");
bool initialized = HeapShared::initialize_enum_klass(this, CHECK);
if (initialized) {
  return;
}
  }

Could you check if (2) is correct?

-

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3]

2022-01-18 Thread Ioi Lam
On Mon, 17 Jan 2022 18:36:35 GMT, Coleen Phillimore  wrote:

>> Ioi Lam has updated the pull request with a new target base due to a merge 
>> or a rebase. The incremental webrev excludes the unrelated changes brought 
>> in by the merge/rebase. The pull request contains four additional commits 
>> since the last revision:
>> 
>>  - Merge branch 'master' into 8275731-heapshared-enum
>>  - added exclusions needed by "java -Xshare:dump -ea -esa"
>>  - Comments from @calvinccheung off-line
>>  - 8275731: CDS archived enums objects are recreated at runtime
>
> src/hotspot/share/cds/cdsHeapVerifier.cpp line 165:
> 
>> 163: 
>> 164: ResourceMark rm;
>> 165: for (JavaFieldStream fs(ik); !fs.done(); fs.next()) {
> 
> Can this call instead
> void InstanceKlass::do_local_static_fields(void f(fieldDescriptor*, Handle, 
> TRAPS), Handle mirror, TRAPS) {
> and have this next few lines in the function?

I moved the code inside a new class CDSHeapVerifier::CheckStaticFields so I can 
call  InstanceKlass::do_local_static_fields

> src/hotspot/share/cds/cdsHeapVerifier.cpp line 254:
> 
>> 252:   InstanceKlass* ik = InstanceKlass::cast(k);
>> 253:   for (JavaFieldStream fs(ik); !fs.done(); fs.next()) {
>> 254: if (!fs.access_flags().is_static()) {
> 
> same here.  It only saves a couple of lines but then you can have the 
> function outside this large function.

You actually found a bug here. I am iterating over non-static fields and should 
walk the inherited fields as well. I changed the code to call 
InstanceKlass::do_nonstatic_fields()

> src/hotspot/share/cds/cdsHeapVerifier.hpp line 52:
> 
>> 50:   mtClassShared,
>> 51:   HeapShared::oop_hash> _table;
>> 52: 
> 
> Is this only used inside cdsHeapVerifier?  if so it should be in the .cpp 
> file. There's also an ArchiveableStaticFieldInfo.  Not sure how they are 
> related.

This `_table` is part of the CDSHeapVerifier instance, which is stack 
allocated. So I need to declare it as part of the CDSHeapVerifier class 
declaration in the hpp file.

> src/hotspot/share/cds/heapShared.cpp line 433:
> 
>> 431:   oop mirror = k->java_mirror();
>> 432:   int i = 0;
>> 433:   for (JavaFieldStream fs(k); !fs.done(); fs.next()) {
> 
> This seems like it should also use InstanceKlass::do_local_static_fields.

Converting this to InstanceKlass::do_nonstatic_fields() is difficult because 
the loop body references 7 different variables declared outside of the loop. 

One thing I tried is to add a new version of do_nonstatic_fields2() that 
supports C++ lambdas. You can see my experiment from here: 

https://github.com/openjdk/jdk/compare/master...iklam:lambda-for-instanceklass-do_local_static_fields2?expand=1

I changed all my new code to use the do_nonstatic_fields2() function with 
lambda.

> src/hotspot/share/cds/heapShared.cpp line 482:
> 
>> 480: copy_open_objects(open_regions);
>> 481: 
>> 482: CDSHeapVerifier::verify();
> 
> Should all this be DEBUG_ONLY ?

I changed CDSHeapVerifier::verify() to a NOT_DEBUG_RETURN function.

> src/hotspot/share/cds/heapShared.hpp line 236:
> 
>> 234: oop _referrer;
>> 235: oop _obj;
>> 236: CachedOopInfo() :_subgraph_info(), _referrer(), _obj() {}
> 
> Should these be initialized to nullptr? does this do this?

These three fields are initialized with the default initializer (empty 
parenthesis) so they will be initialized to the null pointer.

-

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v4]

2022-01-18 Thread Ioi Lam
> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in 
> an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
> public static final SUNDAY = new Day("SUNDAY");
> public static final MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::` is executed twice: once 
> during `java -Xshare:dump`, and once during normal JVM execution. If the 
> archived heap objects references one of the Enum constants created at dump 
> time, we will violate the uniqueness requirements of the Enum constants at 
> runtime. See the test case in the description of 
> [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discovered that an Enum constant of type X is 
> archived, we archive all constants of type X. At Runtime, type X will skip 
> the normal execution of `X::`. Instead, we run 
> `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
> were saved at dump time.
> 
> This is safe as we know that `X::` has no observable side effect -- 
> it only creates the constants of type X, as well as the synthetic value 
> `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
> similar problems where the archived heap objects reference a static field 
> that may be recreated at runtime. There are some manual steps involved, but I 
> analyzed the potential problems found by the tool are they are all safe 
> (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. 
> An example trace of this tool can be found at 
> https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  Use InstanceKlass::do_local_static_fields for some field iterations

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6653/files
  - new: https://git.openjdk.java.net/jdk/pull/6653/files/6e160057..e27d3523

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=02-03

  Stats: 150 lines in 2 files changed: 82 ins; 59 del; 9 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653


[Ping] RFR: 8275731: CDS archived enums objects are recreated at runtime

2022-01-04 Thread Ioi Lam

Still looking for reviewers 

Thanks
- Ioi

On 12/1/21 1:02 PM, Ioi Lam wrote:

**Background:**

In the Java Language, Enums can be tested for equality, so the constants in an 
Enum type must be unique. Javac compiles an enum declaration like this:


public enum Day {  SUNDAY, MONDAY ... }


to


public class Day extends java.lang.Enum {
 public static final SUNDAY = new Day("SUNDAY");
 public static final MONDAY = new Day("MONDAY"); ...
}


With CDS archived heap objects, `Day::` is executed twice: once during 
`java -Xshare:dump`, and once during normal JVM execution. If the archived heap 
objects references one of the Enum constants created at dump time, we will violate 
the uniqueness requirements of the Enum constants at runtime. See the test case in 
the description of [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)

**Fix:**

During -Xshare:dump, if we discovered that an Enum constant of type X is archived, we 
archive all constants of type X. At Runtime, type X will skip the normal execution of 
`X::`. Instead, we run `HeapShared::initialize_enum_klass()` to 
retrieve all the constants of X that were saved at dump time.

This is safe as we know that `X::` has no observable side effect -- it 
only creates the constants of type X, as well as the synthetic value `X::$VALUES`, 
which cannot be observed until X is fully initialized.

**Verification:**

To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
similar problems where the archived heap objects reference a static field that 
may be recreated at runtime. There are some manual steps involved, but I 
analyzed the potential problems found by the tool are they are all safe (after 
the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example 
trace of this tool can be found at 
https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt

**Testing:**

Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

-

Commit messages:
  - 8275731: CDS archived enums objects are recreated at runtime

Changes: https://git.openjdk.java.net/jdk/pull/6653/files
  Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=00
   Issue: https://bugs.openjdk.java.net/browse/JDK-8275731
   Stats: 829 lines in 16 files changed: 787 ins; 2 del; 40 mod
   Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
   Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653




Re: RFR: 8275731: CDS archived enums objects are recreated at runtime [v3]

2021-12-10 Thread Ioi Lam
> **Background:**
> 
> In the Java Language, Enums can be tested for equality, so the constants in 
> an Enum type must be unique. Javac compiles an enum declaration like this:
> 
> 
> public enum Day {  SUNDAY, MONDAY ... } 
> 
> 
> to
> 
> 
> public class Day extends java.lang.Enum {
> public static final SUNDAY = new Day("SUNDAY");
> public static final MONDAY = new Day("MONDAY"); ...
> }
> 
> 
> With CDS archived heap objects, `Day::` is executed twice: once 
> during `java -Xshare:dump`, and once during normal JVM execution. If the 
> archived heap objects references one of the Enum constants created at dump 
> time, we will violate the uniqueness requirements of the Enum constants at 
> runtime. See the test case in the description of 
> [JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)
> 
> **Fix:**
> 
> During -Xshare:dump, if we discovered that an Enum constant of type X is 
> archived, we archive all constants of type X. At Runtime, type X will skip 
> the normal execution of `X::`. Instead, we run 
> `HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
> were saved at dump time.
> 
> This is safe as we know that `X::` has no observable side effect -- 
> it only creates the constants of type X, as well as the synthetic value 
> `X::$VALUES`, which cannot be observed until X is fully initialized.
> 
> **Verification:**
> 
> To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
> similar problems where the archived heap objects reference a static field 
> that may be recreated at runtime. There are some manual steps involved, but I 
> analyzed the potential problems found by the tool are they are all safe 
> (after the current bug is fixed). See cdsHeapVerifier.cpp for gory details. 
> An example trace of this tool can be found at 
> https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt
> 
> **Testing:**
> 
> Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

Ioi Lam has updated the pull request with a new target base due to a merge or a 
rebase. The incremental webrev excludes the unrelated changes brought in by the 
merge/rebase. The pull request contains four additional commits since the last 
revision:

 - Merge branch 'master' into 8275731-heapshared-enum
 - added exclusions needed by "java -Xshare:dump -ea -esa"
 - Comments from @calvinccheung off-line
 - 8275731: CDS archived enums objects are recreated at runtime

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6653/files
  - new: https://git.openjdk.java.net/jdk/pull/6653/files/df0d3f88..6e160057

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=01-02

  Stats: 24204 lines in 951 files changed: 16523 ins; 3176 del; 4505 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-12-01 Thread Ioi Lam




On 11/7/21 9:44 PM, Jaikiran Pai wrote:

Hello Ioi,

On 02/11/21 12:13 am, Ioi Lam wrote:

Hi Jaikiran,

Thanks for writing the test case to explore the problems in this area.

Please see my comments below:
...

Generally speaking, CDS has two levels of archiving:

[1] archiving class metadata -- classes in the 
$JAVA_HOME/lib/classlist are considered to be frequently loaded 
classes. They are parsed from classfiles and stored into the CDS 
archive. At run time, instead of parsing the classes from classfiles, 
the VM directly use the pre-parsed version of these classes (as 
InstanceKlass* in C++).


At runtime, all such pre-parsed classes are initially in the "loaded" 
state. This means their static constructors will be executed when 
these classes are referenced for the first time. So as far as Java 
semantic is concerned, there's no difference between a pre-parsed 
class vs a class loaded from classfile.


E.g, the examples of loggers in static initializers will be executed 
at runtime.


[2] archiving heap objects

As shown in your test, we cannot arbitrarily archive the static 
fields that were initialized during -Xshare:dump, because they may 
have environment dependency.


The strategy used by CDS is to archive only a few static fields in a 
small number of carefully hand-picked system classes. You can see the 
list in


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/cds/heapShared.cpp*L97__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVBMi-78hA$ 



Thank you for that link. That helped. So essentially even though the 
list of classes used for archiving class metadata isn't very tightly 
controlled, the list of objects which are archived in the heap is much 
more selective.


The reason why my PoC ended up reproducing this issue is because it 
just so happened that I selected a class (ModuleDescriptor) which 
(indirectly) is hand-picked in that list of classes that can end up in 
the archived heap.


These static fields are stored into the CDS archive. At run time, 
these fields are essentially copied into the Java heap, and then 
picked up by code like this:


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/java.base/share/classes/jdk/internal/module/ModuleBootstrap.java*L163__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVABkO6Q0w$ 



    public static ModuleLayer boot() {
    Counters.start();

    ModuleLayer bootLayer;
    ArchivedBootLayer archivedBootLayer = ArchivedBootLayer.get();
    if (archivedBootLayer != null) {
    assert canUseArchivedBootLayer();
    bootLayer = archivedBootLayer.bootLayer();
    BootLoader.getUnnamedModule(); // trigger  of 
BootLoader.
CDS.defineArchivedModules(ClassLoaders.platformClassLoader(), 
ClassLoaders.appClassLoader());


    // assume boot layer has at least one module providing a 
service

    // that is mapped to the application class loader.
    JLA.bindToLoader(bootLayer, ClassLoaders.appClassLoader());
    } else {
    bootLayer = boot2();
    }

In the case of the module graph, we remove things that depend on the 
environment (such as CLASSPATH)


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/cds/heapShared.cpp*L190__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVAx46l0Bg$ 



The remaining parts of the archived module graph only depend on the 
following system properties:


    private static boolean canUseArchivedBootLayer() {
    return getProperty("jdk.module.upgrade.path") == null &&
   getProperty("jdk.module.path") == null &&
   getProperty("jdk.module.patch.0") == null &&   // 
--patch-module
   getProperty("jdk.module.main") == null &&  // 
--module
   getProperty("jdk.module.addmods.0") == null &&    // 
--add-modules
   getProperty("jdk.module.limitmods") == null && // 
--limit-modules
   getProperty("jdk.module.addreads.0") == null &&    // 
--add-reads
   getProperty("jdk.module.addexports.0") == null &&  // 
--add-exports
   getProperty("jdk.module.addopens.0") == null; // 
--add-opens

    }

As a result, we will invalidate the archived module graph if these 
properties differ between dump time and run time. The Java code above 
only asserts that the check has already been done. The actual check 
is done in here:


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/runtime/arguments.cpp*L1339__;Iw!!

RFR: 8275731: CDS archived enums objects are recreated at runtime

2021-12-01 Thread Ioi Lam
**Background:**

In the Java Language, Enums can be tested for equality, so the constants in an 
Enum type must be unique. Javac compiles an enum declaration like this:


public enum Day {  SUNDAY, MONDAY ... } 


to


public class Day extends java.lang.Enum {
public static final SUNDAY = new Day("SUNDAY");
public static final MONDAY = new Day("MONDAY"); ...
}


With CDS archived heap objects, `Day::` is executed twice: once during 
`java -Xshare:dump`, and once during normal JVM execution. If the archived heap 
objects references one of the Enum constants created at dump time, we will 
violate the uniqueness requirements of the Enum constants at runtime. See the 
test case in the description of 
[JDK-8275731](https://bugs.openjdk.java.net/browse/JDK-8275731)

**Fix:**

During -Xshare:dump, if we discovered that an Enum constant of type X is 
archived, we archive all constants of type X. At Runtime, type X will skip the 
normal execution of `X::`. Instead, we run 
`HeapShared::initialize_enum_klass()` to retrieve all the constants of X that 
were saved at dump time.

This is safe as we know that `X::` has no observable side effect -- it 
only creates the constants of type X, as well as the synthetic value 
`X::$VALUES`, which cannot be observed until X is fully initialized.

**Verification:**

To avoid future problems, I added a new tool, CDSHeapVerifier, to look for 
similar problems where the archived heap objects reference a static field that 
may be recreated at runtime. There are some manual steps involved, but I 
analyzed the potential problems found by the tool are they are all safe (after 
the current bug is fixed). See cdsHeapVerifier.cpp for gory details. An example 
trace of this tool can be found at 
https://bugs.openjdk.java.net/secure/attachment/97242/enum_warning.txt

**Testing:**

Passed Oracle CI tiers 1-4. WIll run tier 5 as well.

-

Commit messages:
 - 8275731: CDS archived enums objects are recreated at runtime

Changes: https://git.openjdk.java.net/jdk/pull/6653/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6653&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8275731
  Stats: 829 lines in 16 files changed: 787 ins; 2 del; 40 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6653.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6653/head:pull/6653

PR: https://git.openjdk.java.net/jdk/pull/6653


Re: [External] : Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-11-01 Thread Ioi Lam

Hi Jaikiran,

Thanks for writing the test case to explore the problems in this area.

Please see my comments below:

On 10/29/21 5:55 AM, Jaikiran Pai wrote:

Hello Ioi (and others),

On 22/10/21 1:39 pm, Jaikiran Pai wrote:

Hello Ioi,

On 22/10/21 12:29 pm, Ioi Lam wrote:



On 10/21/21 9:09 PM, Jaikiran Pai wrote:

Hello Ioi,




This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen 
elsewhere?


The static constructors of enum classes are executed at both CDS 
dump time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived 
heap during dump time, it will be different than the value of 
Modifier.OPEN that is re-created at runtime by the execution of 
Modifier.


I have almost next to nothing knowledge about CDS internals. My 
only understanding of it is based on some documentation that I have 
read. One of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" 
like it states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple 
JVM processes."


But from what you explain in that enum example, it looks like it 
also stores class instance data that is computed at build time on 
the host where the JDK image was generated? Did I understand it 
correctly? Is this only for enums or does it also store the static 
initialization data of "class" types too? If it does store the 
static init data of class types too, then wouldn't such data be 
host/build time specific and as such the classes that need to be 
enrolled into the default CDS archive of the JDK should be very 
selective (by reviewing what they do in their static init)? Like I 
said, I haven't looked into this in detail so perhaps it already is 
selective in the JDK build?




Hi Jaikiran,


Thank you very much for the detailed response.

CDS also has the ability to archive Java heap object. Since 
https://bugs.openjdk.java.net/browse/JDK-8244778 , we have archived 
the entire module graph to improve start-up time. At run time, the 
module graph (as well as other archived heap objects) are loaded 
from the CDS archive and put into the Java heap (either through 
memory mapping or copying).


That is interesting and something that I hadn't known.



You can see the related code in 
jdk.internal.module.ModuleBootstrap::boot()
I just had a look at it and it's quite elaborate and it'll take a me 
while to fully grasp it (if at all) given its understandable complexity.


When the module system has started up, the module graph will 
reference a copy of the OPEN enum object that was created as part of 
the archive. However, the Modifier. will also be executed at 
VM start-up, causing a second copy of the OPEN enum object to be 
stored into the static field Modified::OPEN.


Thank you for that detail. That helps me understand this a bit more 
(and opens a few questions). To be clear - the VM startup code which 
creates that other copy, ends up creating that copy because that 
piece of initialization happens even before the module system has 
fully started up and created those references from the archived 
state? Otherwise, the classloaders I believe would be smart enough to 
not run that static init again, had the module system with that graph 
from the archived state been fully "up"?


So would this mean that this not just impacts enums but essentially 
every class referenced within the module system (of just boot layer?) 
that has state which is initialized during static init? To be more 
precise, consider the very common example of loggers which are 
typically static initialized and stored in a static (final) field:


private static final java.util.logger.Logger logger = 
Logger.getLogger(SomeClass.class);


If the boot layer module graph has any classes which has state like 
this, it would then mean that if such classes do get initialized very 
early on during VM startup, then they too are impacted and the module 
graph holding instances of such classes will end up using a different 
instance for such fields as compared to the rest of the application 
code?


In essence, such classes which get accessed early (before module 
system with the archived graph is "up") during VM startup can end up 
_virtually_ having their static initialization run twice (I 
understand it won't be run twice, but that's the end result, isn't it)?


I was really curious why this was only applicable to enums and why 
other static initial

Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-22 Thread Ioi Lam




On 10/21/21 9:09 PM, Jaikiran Pai wrote:

Hello Ioi,




This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen elsewhere?

The static constructors of enum classes are executed at both CDS dump 
time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived 
heap during dump time, it will be different than the value of 
Modifier.OPEN that is re-created at runtime by the execution of 
Modifier.


I have almost next to nothing knowledge about CDS internals. My only 
understanding of it is based on some documentation that I have read. 
One of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" like 
it states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple JVM 
processes."


But from what you explain in that enum example, it looks like it also 
stores class instance data that is computed at build time on the host 
where the JDK image was generated? Did I understand it correctly? Is 
this only for enums or does it also store the static initialization 
data of "class" types too? If it does store the static init data of 
class types too, then wouldn't such data be host/build time specific 
and as such the classes that need to be enrolled into the default CDS 
archive of the JDK should be very selective (by reviewing what they do 
in their static init)? Like I said, I haven't looked into this in 
detail so perhaps it already is selective in the JDK build?




Hi Jaikiran,

CDS also has the ability to archive Java heap object. Since 
https://bugs.openjdk.java.net/browse/JDK-8244778 , we have archived the 
entire module graph to improve start-up time. At run time, the module 
graph (as well as other archived heap objects) are loaded from the CDS 
archive and put into the Java heap (either through memory mapping or 
copying).


You can see the related code in jdk.internal.module.ModuleBootstrap::boot()

When the module system has started up, the module graph will reference a 
copy of the OPEN enum object that was created as part of the archive. 
However, the Modifier. will also be executed at VM start-up, 
causing a second copy of the OPEN enum object to be stored into the 
static field Modified::OPEN.


When an application tries to get the OPEN enum, it will get the copy 
stored in the static field Modified::OPEN. If you walk the module graph, 
you get get a reference to the other copy. Since these are two distinct 
heap objects, the == operator will return comparing them.


Thanks
- Ioi





Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-21 Thread Ioi Lam




On 10/21/21 5:25 AM, Alan Bateman wrote:

On 21/10/2021 10:49, Jaikiran Pai wrote:

:

Digging into it, it appears that since the ModuleDescriptor#equals() 
calls equals() on enum types (in this specific case on 
ModuleDescriptor.Requires.Modifier) and since enum type equality is 
implemented as identity checks, those identity checks are 
surprisingly failing. More specifically 
ModuleDescriptor.Requires.Modifier.MANDATED == 
ModuleDescriptor.Requires.Modifier.MANDATED is equating to false 
because at runtime I see that two different instances of 
ModuleDescriptor.Requires.Modifier.MANDATED have been loaded (by the 
same boot module classloader). Although I use 
ModuleDescriptor.Requires.Modifier.MANDATED as an example, the same 
is reproducible with other enum values like 
ModuleDescriptor.Requires.Modifier.TRANSITIVE.


This appears to be specific to CDS since running the above program with:

java -Xshare:off EnumEquality

succeeds and the ModuleDescriptor equality check passes.

In short, it looks like there is some general issue with CDS and 
equality checks with enums and perhaps deserves a separate JBS issue?
I've asked Ioi Lam to comment on this, off-hand I'm not aware of any 
issues with CDS here but it may be related to the archiving of object 
graphs.


-Alan


Hi Jaikiran and Alan,

Thanks for reporting this issue. It's a bug in CDS. I have filed 
https://bugs.openjdk.java.net/browse/JDK-8275731 and am working on a fix.


This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen elsewhere?

The static constructors of enum classes are executed at both CDS dump 
time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived heap 
during dump time, it will be different than the value of Modifier.OPEN 
that is re-created at runtime by the execution of Modifier.


Thanks
- Ioi


Re: RFR: 8272600: (test) Use native "sleep" in Basic.java [v4]

2021-09-15 Thread Ioi Lam
On Fri, 3 Sep 2021 14:25:53 GMT, Roger Riggs  wrote:

>> The intermittent test in java/lang/ProcessBuilder/Basic.java has identified 
>> unexpected messages from a child Java VM
>> as the cause of the test failure.  Attempts to control the output of the 
>> child VM have failed, the VM is unrepentant .
>> 
>> There is no functionality in the child except to wait long enough for the 
>> test to finish and the child is destroyed.
>> The fix is to switch from using a Java child to using a native child; a new 
>> executable `sleepmillis`.
>
> Roger Riggs has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Revert to using BasicSleep on Windows
>   Added diagnostic output for a test that sometimes fails on Linux when using 
> /bin/sleep.
>   Addressed review comments.

Marked as reviewed by iklam (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/5239


Re: RFR: JDK-8273526: Extend the OSContainer API pids controller with pids.current [v4]

2021-09-15 Thread Ioi Lam
On Wed, 15 Sep 2021 07:46:34 GMT, Matthias Baesken  wrote:

>> https://bugs.openjdk.java.net/browse/JDK-8266490
>> extended the OSContainer API in order to also support the pids controller of 
>> cgroups. However only pids.max output was added with 8266490.
>> There is a second parameter pids.current , see 
>> https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#pid
>> that would be helpful too and can be added to the OSContainer API .
>> pids.current :
>> A read-only single value file which exists on all cgroups.
>> The number of processes currently in the cgroup and its descendants.
>> 
>> Best regards, Matthias
>
> Matthias Baesken has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Adjust TestPids

Marked as reviewed by iklam (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/5437


Re: RFR: JDK-8273526: Extend the OSContainer API pids controller with pids.current [v3]

2021-09-14 Thread Ioi Lam
On Tue, 14 Sep 2021 16:36:24 GMT, Ioi Lam  wrote:

>> Matthias Baesken has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Simplify coding following Severins advice
>
> test/hotspot/jtreg/containers/docker/TestPids.java line 97:
> 
>> 95: System.out.println("DEBUG: parts.length = " + 
>> parts.length);
>> 96: if (expectedValue.equals("no_value_expected")) {
>> 97: Asserts.assertEquals(parts.length, 2);
> 
> Is "no_value_expected" generated by Docker? I searched the entire HotSpot 
> source code and couldn't find it. I also couldn't find "WARNING: Your kernel 
> does not support pids limit capabilities". 
> 
> To make it easier to understand this test, I would suggest grouping all 
> messages that were generated outside of HotSpot into something like:
> 
> 
> // These messages are generated by Docker
> static final String warning_kernel_no_pids_support = "WARNING: Your kernel 
> does not support pids limit capabilities";
> static final String no_value_expected = "no_value_expected";

Oh, I misunderstood the test. "no_value_expected" was passed in from 
`testPids()` in this file, but that's confusing because you are expecting an 
integer value. Maybe it should be "any_integer"?

-

PR: https://git.openjdk.java.net/jdk/pull/5437


Re: RFR: JDK-8273526: Extend the OSContainer API pids controller with pids.current [v3]

2021-09-14 Thread Ioi Lam
On Tue, 14 Sep 2021 14:27:36 GMT, Matthias Baesken  wrote:

>> https://bugs.openjdk.java.net/browse/JDK-8266490
>> extended the OSContainer API in order to also support the pids controller of 
>> cgroups. However only pids.max output was added with 8266490.
>> There is a second parameter pids.current , see 
>> https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#pid
>> that would be helpful too and can be added to the OSContainer API .
>> pids.current :
>> A read-only single value file which exists on all cgroups.
>> The number of processes currently in the cgroup and its descendants.
>> 
>> Best regards, Matthias
>
> Matthias Baesken has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Simplify coding following Severins advice

HotSpot changes look good to me. I have a comment on the test.

test/hotspot/jtreg/containers/docker/TestPids.java line 97:

> 95: System.out.println("DEBUG: parts.length = " + 
> parts.length);
> 96: if (expectedValue.equals("no_value_expected")) {
> 97: Asserts.assertEquals(parts.length, 2);

Is "no_value_expected" generated by Docker? I searched the entire HotSpot 
source code and couldn't find it. I also couldn't find "WARNING: Your kernel 
does not support pids limit capabilities". 

To make it easier to understand this test, I would suggest grouping all 
messages that were generated outside of HotSpot into something like:


// These messages are generated by Docker
static final String warning_kernel_no_pids_support = "WARNING: Your kernel does 
not support pids limit capabilities";
static final String no_value_expected = "no_value_expected";

-

PR: https://git.openjdk.java.net/jdk/pull/5437


Re: RFR: 8273092: Sort classlist in JDK image [v2]

2021-08-31 Thread Ioi Lam
On Sat, 28 Aug 2021 13:18:37 GMT, Claes Redestad  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   @dfuch comments
>
> Seems OK.

Thanks @cl4es @magicus @dfuch for the review!

-

PR: https://git.openjdk.java.net/jdk/pull/5288


Integrated: 8273092: Sort classlist in JDK image

2021-08-31 Thread Ioi Lam
On Fri, 27 Aug 2021 23:12:52 GMT, Ioi Lam  wrote:

> When the classlist is generated using build.tools.classlist.HelloClasslist, 
> its contents may be non-deterministic due to Java thread execution order.
> 
> We should sort the generated classlist to make the JDK image's contents more 
> deterministic.
> 
> Tested with Mach5 tier1, tier2, builds-tier5

This pull request has now been integrated.

Changeset: 1996f649
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/1996f649a3a30b7ac4b547a762417f807f5fa414
Stats: 114 lines in 3 files changed: 102 ins; 6 del; 6 mod

8273092: Sort classlist in JDK image

Reviewed-by: redestad, ihse, dfuchs

-

PR: https://git.openjdk.java.net/jdk/pull/5288


Re: RFR: 8273092: Sort classlist in JDK image [v2]

2021-08-30 Thread Ioi Lam
> When the classlist is generated using build.tools.classlist.HelloClasslist, 
> its contents may be non-deterministic due to Java thread execution order.
> 
> We should sort the generated classlist to make the JDK image's contents more 
> deterministic.
> 
> Tested with Mach5 tier1, tier2, builds-tier5

Ioi Lam has updated the pull request incrementally with one additional commit 
since the last revision:

  @dfuch comments

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5288/files
  - new: https://git.openjdk.java.net/jdk/pull/5288/files/dc170ec0..ee710895

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5288&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5288&range=00-01

  Stats: 15 lines in 1 file changed: 7 ins; 7 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5288.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5288/head:pull/5288

PR: https://git.openjdk.java.net/jdk/pull/5288


Re: RFR: 8273092: Sort classlist in JDK image [v2]

2021-08-30 Thread Ioi Lam
On Mon, 30 Aug 2021 12:51:43 GMT, Daniel Fuchs  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   @dfuch comments
>
> make/jdk/src/classes/build/tools/classlist/SortClasslist.java line 58:
> 
>> 56: String line = scanner.nextLine();
>> 57: Matcher m = p.matcher(line);
>> 58: if (m.find()) {
> 
> Are we sure that a comment line will not match this regexp, or that if it 
> matches, it is not a comment line?

Thanks for the comments. I've swapper the matching order to check for leading 
`#` and `@` characters first.

-

PR: https://git.openjdk.java.net/jdk/pull/5288


RFR: 8273092: Sort classlist in JDK image

2021-08-27 Thread Ioi Lam
When the classlist is generated using build.tools.classlist.HelloClasslist, its 
contents may be non-deterministic due to Java thread execution order.

We should sort the generated classlist to make the JDK image's contents more 
deterministic.

Tested with Mach5 tier1, tier2, builds-tier5

-

Commit messages:
 - 8273092: Sort classlist in JDK image

Changes: https://git.openjdk.java.net/jdk/pull/5288/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5288&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8273092
  Stats: 114 lines in 3 files changed: 102 ins; 6 del; 6 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5288.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5288/head:pull/5288

PR: https://git.openjdk.java.net/jdk/pull/5288


Re: RFR: 8272600: (test) Use native "sleep" in Basic.java [v2]

2021-08-25 Thread Ioi Lam
On Wed, 25 Aug 2021 21:45:57 GMT, Roger Riggs  wrote:

>> The intermittent test in java/lang/ProcessBuilder/Basic.java has identified 
>> unexpected messages from a child Java VM
>> as the cause of the test failure.  Attempts to control the output of the 
>> child VM have failed, the VM is unrepentant .
>> 
>> There is no functionality in the child except to wait long enough for the 
>> test to finish and the child is destroyed.
>> The fix is to switch from using a Java child to using a native child; a new 
>> executable `sleepmillis`.
>
> Roger Riggs has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Revised to use a native program for sleep

LGTM

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5239


Re: RFR: 8272600: (test) Use native "sleep" in Basic.java

2021-08-24 Thread Ioi Lam
On Tue, 24 Aug 2021 19:15:56 GMT, Roger Riggs  wrote:

>> test/jdk/java/lang/ProcessBuilder/Basic.java line 2635:
>> 
>>> 2633: List childArgs = null;
>>> 2634: Path sleepExe = TEST_NATIVEPATH.resolve("sleepmillis");
>>> 2635: if (sleepExe.toFile().canExecute()) {
>> 
>> Why is the fallback necessary? Other test cases such as 
>> test/jdk/tools/launcher/JliLaunchTest.java do not have such a fallback.
>> 
>> Also, I noticed that JliLaunchTest does something like this:
>> 
>> 
>> Path launcher = Paths.get(System.getProperty("test.nativepath"),
>> "JliLaunchTest" + (Platform.isWindows() ? ".exe" : ""));
>> 
>> 
>> but test/jdk/java/lang/reflect/exeCallerAccessTest/CallerAccessTest.java 
>> doesn't add ".exe", so it may not be necessary.
>
> See above, the java Child is more portable and lower maintenance.
> Windows looks for xxx.exe if xxx is not found.

How about adding a comment to explain why the fallback is necessary?

Also, if `jtreg -nativepath:xxx` is specified, but there's no xxx/sleepmillis, 
or xxx/sleepmillis is not executable, that should be a setup error (or a bug in 
the test itself). E.g., if we are testing inside mach5, we should always 
execute the native program, and should not silently fallback to the java Child 
program. Otherwise, setup problems in mach5 might bring us back to the 
mysterious intermittent failure.

(The current version of the code is buggy on Windows and will always silently 
fall back to Child because the executable is named "sleepmillis.exe", not 
"sleepmillis").

-

PR: https://git.openjdk.java.net/jdk/pull/5239


Re: RFR: 8272600: (test) Use native "sleep" in Basic.java

2021-08-24 Thread Ioi Lam
On Tue, 24 Aug 2021 15:55:28 GMT, Roger Riggs  wrote:

> The intermittent test in java/lang/ProcessBuilder/Basic.java has identified 
> unexpected messages from a child Java VM
> as the cause of the test failure.  Attempts to control the output of the 
> child VM have failed, the VM is unrepentant .
> 
> There is no functionality in the child except to wait long enough for the 
> test to finish and the child is destroyed.
> The fix is to switch from using a Java child to using a native child; a new 
> executable `sleepmillis`.

test/jdk/java/lang/ProcessBuilder/Basic.java line 30:

> 28:  *  6464154 6523983 6206031 4960438 6631352 6631966 6850957 6850958
> 29:  *  4947220 7018606 7034570 4244896 5049299 8003488 8054494 8058464
> 30:  *  8067796 8224905 8263729 8265173 8272600 8231297

The test should also be modified to use `@run main/othervm/native/timeout=300` 
so that this test will be flagged by jtreg if `-nativepath:` is not specified.

test/jdk/java/lang/ProcessBuilder/Basic.java line 2635:

> 2633: List childArgs = null;
> 2634: Path sleepExe = TEST_NATIVEPATH.resolve("sleepmillis");
> 2635: if (sleepExe.toFile().canExecute()) {

Why is the fallback necessary? Other test cases such as 
test/jdk/tools/launcher/JliLaunchTest.java do not have such a fallback.

Also, I noticed that JliLaunchTest does something like this:


Path launcher = Paths.get(System.getProperty("test.nativepath"),
"JliLaunchTest" + (Platform.isWindows() ? ".exe" : ""));


but test/jdk/java/lang/reflect/exeCallerAccessTest/CallerAccessTest.java 
doesn't add ".exe", so it may not be necessary.

test/jdk/java/lang/ProcessBuilder/exeSleepMillis.c line 45:

> 43: sleeptime.tv_nsec = (millis % 1000) * 1000 * 1000;
> 44: int rc;
> 45: while ((rc = nanosleep(&sleeptime, &sleeptime)) > 0) {

is `nanosleep` a portable call? I couldn't find documentation for it with 
google search of `nanosleep site:docs.microsoft.com`.

-

PR: https://git.openjdk.java.net/jdk/pull/5239


Re: RFR: 8271368: [BACKOUT] JDK-8266054 VectorAPI rotate operation optimization

2021-07-27 Thread Ioi Lam
On Wed, 28 Jul 2021 05:35:59 GMT, Vladimir Kozlov  wrote:

> Backout the following changes due to vector tests failures in tier 2 and 
> later: 
> [JDK-8266054](https://bugs.openjdk.java.net/browse/JDK-8266054) VectorAPI 
> rotate operation optimization 
> 
> Changes also caused copyright header validation failure in Tier1 due to 
> missing `,` after copyright year in new test.
> 
> Currently running testing.

LGTM

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4915


Re: [jdk17] RFR: 8269840: Update Platform.isDefaultCDSArchiveSupported() to return true for Linux-aarch64

2021-07-07 Thread Ioi Lam
On Thu, 8 Jul 2021 01:59:25 GMT, Mikhailo Seledtsov  
wrote:

> Now that "JDK-8268212 Build linux-aarch64 natively" added support for default 
> CDS archive, time to update test configuration for this platform. This is a 
> very small one-line change.

Looks good, but this returns true also for osx-aarch64 and windows-aarch64, so 
the comment message needs to be updated.

-

Marked as reviewed by iklam (Reviewer).

PR: https://git.openjdk.java.net/jdk17/pull/229


Integrated: 8265605: Cannot call BootLoader::loadClassOrNull before initPhase2

2021-05-13 Thread Ioi Lam
On Wed, 12 May 2021 05:51:14 GMT, Ioi Lam  wrote:

> This bug was discovered during the development of 
> [JDK-6824466](https://bugs.openjdk.java.net/browse/JDK-6824466): when CDS is 
> enabled, if `BootLoader::loadClassOrNull` is called before `initPhase2`, it 
> would trigger the initialization of the archived module graph in the wrong 
> order. Because of unanticipated nesting of `` methods, 
> `BootLoader::SERVICES_CATALOG` becomes empty, causing future `ServiceLoader` 
> operations to fail.
> 
> The fix has 2 parts:
> 
> - `BootLoader::loadClassOrNull` no longer calls `ClassLoaders::bootLoader()`. 
> This avoids triggering the archived module graph initialization. Instead, it 
> makes a direct call to `Classloader::findBootstrapClassOrNull()`. We don't 
> actually need a `ClassLoader` instance for this call, so I changed 
> `Classloader::findBootstrapClassOrNull()` to be a static method.
> - The object returned by `BootLoader::getServicesCatalog()` is now maintained 
> inside `jdk.internal.loader.ClassLoaders`.  Although not strictly required 
> for the fix, this simplifies the initialization of the archived module graph. 
> It also makes the logic consistent for the 3 built-in loaders 
> (boot/platform/app).
> 
> Testing: tiers1-4 in progress. I also verified that the bug reported by Mandy 
> is no longer reproducible after I applied this patch on her branch.

This pull request has now been integrated.

Changeset: 1e0ecd6d
Author:Ioi Lam 
URL:   
https://git.openjdk.java.net/jdk/commit/1e0ecd6d56541c948e0d120295f5008d3248598f
Stats: 43 lines in 8 files changed: 9 ins; 13 del; 21 mod

8265605: Cannot call BootLoader::loadClassOrNull before initPhase2

Reviewed-by: alanb, mchung

-

PR: https://git.openjdk.java.net/jdk/pull/3992


Re: RFR: 8265605: Cannot call BootLoader::loadClassOrNull before initPhase2 [v3]

2021-05-13 Thread Ioi Lam
On Thu, 13 May 2021 06:19:47 GMT, Alan Bateman  wrote:

>> Ioi Lam has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   cleaned up ClassLoaders.setArchivedServicesCatalog
>
> Marked as reviewed by alanb (Reviewer).

Thanks @AlanBateman and @mlchung for the review.

-

PR: https://git.openjdk.java.net/jdk/pull/3992


Re: RFR: 8265605: Cannot call BootLoader::loadClassOrNull before initPhase2 [v4]

2021-05-13 Thread Ioi Lam
> This bug was discovered during the development of 
> [JDK-6824466](https://bugs.openjdk.java.net/browse/JDK-6824466): when CDS is 
> enabled, if `BootLoader::loadClassOrNull` is called before `initPhase2`, it 
> would trigger the initialization of the archived module graph in the wrong 
> order. Because of unanticipated nesting of `` methods, 
> `BootLoader::SERVICES_CATALOG` becomes empty, causing future `ServiceLoader` 
> operations to fail.
> 
> The fix has 2 parts:
> 
> - `BootLoader::loadClassOrNull` no longer calls `ClassLoaders::bootLoader()`. 
> This avoids triggering the archived module graph initialization. Instead, it 
> makes a direct call to `Classloader::findBootstrapClassOrNull()`. We don't 
> actually need a `ClassLoader` instance for this call, so I changed 
> `Classloader::findBootstrapClassOrNull()` to be a static method.
> - The object returned by `BootLoader::getServicesCatalog()` is now maintained 
> inside `jdk.internal.loader.ClassLoaders`.  Although not strictly required 
> for the fix, this simplifies the initialization of the archived module graph. 
> It also makes the logic consistent for the 3 built-in loaders 
> (boot/platform/app).
> 
> Testing: tiers1-4 in progress. I also verified that the bug reported by Mandy 
> is no longer reproducible after I applied this patch on her branch.

Ioi Lam has updated the pull request with a new target base due to a merge or a 
rebase. The incremental webrev excludes the unrelated changes brought in by the 
merge/rebase. The pull request contains four additional commits since the last 
revision:

 - Merge branch 'master' into 
8265605-bootloader-loadClassOrNull-before-init-phase2
 - cleaned up ClassLoaders.setArchivedServicesCatalog
 - @AlanBateman comments
 - 8265605: Cannot call BootLoader::loadClassOrNull before initPhase2

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3992/files
  - new: https://git.openjdk.java.net/jdk/pull/3992/files/4cd981c0..a3dc9427

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3992&range=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3992&range=02-03

  Stats: 9450 lines in 343 files changed: 2601 ins; 4164 del; 2685 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3992.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3992/head:pull/3992

PR: https://git.openjdk.java.net/jdk/pull/3992


  1   2   3   >