Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v15]

2025-01-14 Thread Severin Gehwolf
On Fri, 10 Jan 2025 14:57:09 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been added in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 41 commits:
> 
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Fix missing imports
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Add exclusive access dirs directive for systemd tests
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - ... and 31 more: https://git.openjdk.org/jdk/compare/1f457977...c83c22eb

Keep open, bot.

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2590507466


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v15]

2025-01-10 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 41 commits:

 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Fix missing imports
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - ... and 31 more: https://git.openjdk.org/jdk/compare/1f457977...c83c22eb

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=14
  Stats: 1621 lines in 27 files changed: 1373 ins; 152 del; 96 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v13]

2024-12-17 Thread Severin Gehwolf
On Tue, 19 Nov 2024 13:37:25 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been added in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 38 commits:
> 
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Add exclusive access dirs directive for systemd tests
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Improve systemd slice test for non-root on cg v2
>  - Fix unit tests
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - ... and 28 more: https://git.openjdk.org/jdk/compare/23597361...b56fc7b7

Ping? Still looking for a review of this, thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2548694574


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v14]

2024-12-17 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 40 commits:

 - Fix missing imports
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - ... and 30 more: https://git.openjdk.org/jdk/compare/ff85865b...ceffcfc6

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=13
  Stats: 1621 lines in 27 files changed: 1373 ins; 152 del; 96 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v9]

2024-12-11 Thread Severin Gehwolf
On Wed, 11 Dec 2024 15:19:06 GMT, Sergey Chernyshev  
wrote:

>> src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 322:
>> 
>>> 320: } else {
>>> 321:   log_warning(os, container)("Cgroup cpu/memory controller path 
>>> includes '../', detected limits won't be accurate");
>>> 322: }
>> 
>> Please move this warning to `CgroupUtil::adjust_controller` and abort the 
>> adjustment, we don't need to issue this warning multiple times, and we'd not 
>> be able to adjust it to a path that will work. Showing the warning once 
>> should be sufficient. We shouldn't see this path in any non-moved scenarios. 
>> It would perhaps help if we included some detail why this warning is being 
>> shown. I suggest:
>> 
>> ```cgroup controller path seems to have moved (includes '.../'), detected 
>> limits won't be accurate```
>
> Would you recommand also to include the paths in that warning? Something like
> ```cgroup controller path at '/sys/fs/cgroup' seems to have moved to 
> '../../test', detected limits won't be accurate```
> This way it will have all the necessary information to investigate customer 
> cases.

Seems fine yes.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1880407169


Re: RFR: 8345684: OperatingSystemMXBean.getSystemCpuLoad() throws NPE [v3]

2024-12-09 Thread Severin Gehwolf
On Mon, 9 Dec 2024 13:00:20 GMT, Severin Gehwolf  wrote:

> Since this is my second merged PR to the OpenJDK, could I become one?

If you want to be, the process for becoming OpenJDK author is described here:
https://openjdk.org/projects/#project-author

The relevant project is `jdk`:
https://openjdk.org/census#jdk

-

PR Comment: https://git.openjdk.org/jdk/pull/22611#issuecomment-2527868999


Re: RFR: 8345684: OperatingSystemMXBean.getSystemCpuLoad() throws NPE [v3]

2024-12-09 Thread Severin Gehwolf
On Mon, 9 Dec 2024 12:27:21 GMT, Severin Gehwolf  wrote:

>> @kevinjwalls Thanks! Apologies in advance if this is not the right process, 
>> but could this be cherry-picked into JDK 21?
>
>> Apologies in advance if this is not the right process, but could this be 
>> cherry-picked into JDK 21?
> 
> Please see 
> https://github.com/openjdk/jdk21u-dev?tab=readme-ov-file#welcome-to-openjdk-21-updates
>  for the process.

> @jerboaa It looks like I would need to be at least an _Author_ to follow 
> these instructions. Since this is my second merged PR to the OpenJDK, could I 
> become one?

For the skara bot command, yes. For manual contribution, no. See 
https://wiki.openjdk.org/display/JDKUpdates/How+to+contribute+or+backport+a+fix

-

PR Comment: https://git.openjdk.org/jdk/pull/22611#issuecomment-2527863019


Re: RFR: 8345684: OperatingSystemMXBean.getSystemCpuLoad() throws NPE [v3]

2024-12-09 Thread Severin Gehwolf
On Mon, 9 Dec 2024 12:14:32 GMT, Fabian Meumertzheim  wrote:

> Apologies in advance if this is not the right process, but could this be 
> cherry-picked into JDK 21?

Please see 
https://github.com/openjdk/jdk21u-dev?tab=readme-ov-file#welcome-to-openjdk-21-updates
 for the process.

-

PR Comment: https://git.openjdk.org/jdk/pull/22611#issuecomment-2527790319


Re: RFR: 8345684: OperatingSystemMXBean.getSystemCpuLoad() throws NPE [v3]

2024-12-09 Thread Severin Gehwolf
On Mon, 9 Dec 2024 10:07:14 GMT, Fabian Meumertzheim  wrote:

>> The return value of Metrics#getCpuSetCpus may change over time, including 
>> from non-null to null across the two calls in this method.
>
> Fabian Meumertzheim has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Empty commit to trigger CI checks

LGTM

-

Marked as reviewed by sgehwolf (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22611#pullrequestreview-2488428790


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v9]

2024-12-06 Thread Severin Gehwolf
On Thu, 5 Dec 2024 20:05:56 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request with a new target base due to 
> a merge or a rebase. The pull request now contains 13 commits:
> 
>  - diverged after integration of JDK-8344177
>
># Conflicts:
>#  src/java.base/linux/classes/jdk/internal/platform/CgroupUtil.java
>  - update cgroup v1 in metrics
>  - Apply suggestions from code review
>
>Co-authored-by: Severin Gehwolf 
>  - updated test (path is reduced)
>  - updated test (path is reduced)
>  - adjust path suffix in cgroup (v1) version specific code, when root != 
> cgroup
>  - Merge branch 'master' into JDK-8343191
>  - warn wenn ../ encountered, update path adjustment
>  - Update src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp
>
>Co-authored-by: Severin Gehwolf 
>  - Merge branch 'master' into JDK-8343191
>  - ... and 3 more: https://git.openjdk.org/jdk/compare/bd6d911c...2a7e9d82

FYI: I'll try to test and review this more thoroughly next week.

src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 322:

> 320: } else {
> 321:   log_warning(os, container)("Cgroup cpu/memory controller path 
> includes '../', detected limits won't be accurate");
> 322: }

Please move this warning to `CgroupUtil::adjust_controller` and abort the 
adjustment, we don't need to issue this warning multiple times, and we'd not be 
able to adjust it to a path that will work. Showing the warning once should be 
sufficient. We shouldn't see this path in any non-moved scenarios. It would 
perhaps help if we included some detail why this warning is being shown. I 
suggest:

```cgroup controller path seems to have moved (includes '.../'), detected 
limits won't be accurate```

-

PR Review: https://git.openjdk.org/jdk/pull/21808#pullrequestreview-2484225572
PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1872956609


Re: RFR: 8345286: Remove use of SecurityManager API from misc areas [v8]

2024-12-02 Thread Severin Gehwolf
On Mon, 2 Dec 2024 16:11:57 GMT, Jaikiran Pai  wrote:

>> Can I please get a review of this change which removes usages of 
>> SecurityManager related APIs and some leftover related to SecurityManager 
>> changes?
>> 
>> This addresses https://bugs.openjdk.org/browse/JDK-8345286. Most of these 
>> changes are trivial. The 
>> `src/java.base/linux/classes/jdk/internal/platform/CgroupUtil.java` used to 
>> expose utility methods for dealing with SecurityManager permissions and it 
>> was called from a few places. That class is no longer needed with the clean 
>> up done in this PR.
>> 
>> No new tests have been introduced and tier testing is currently in progress.
>
> Jaikiran Pai has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 13 commits:
> 
>  - merge latest from master branch
>  - remove changes to 
> src/java.base/unix/classes/sun/security/provider/NativePRNG.java
>  - remove unused import
>  - replace remaining Paths.get() with Path.of() in the updated files
>  - Update 
> src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
>
>Co-authored-by: Severin Gehwolf 
>  - Update 
> src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java
>
>Co-authored-by: Severin Gehwolf 
>  - Path.of() instead of Paths.get() in CgroupV2Subsystem.java
>  - remove unnecessary space
>  - Path.of() instead of Paths.get()
>  - fix formatting of try-with-resources in CgroupSubsystemController.java
>  - ... and 3 more: https://git.openjdk.org/jdk/compare/30b8bbe2...9baa279b

cgroup changes still look good.

-

Marked as reviewed by sgehwolf (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22478#pullrequestreview-2473306738


Re: RFR: 8345286: Remove use of SecurityManager API from misc areas [v5]

2024-12-02 Thread Severin Gehwolf
On Mon, 2 Dec 2024 14:16:02 GMT, Jaikiran Pai  wrote:

>> Can I please get a review of this change which removes usages of 
>> SecurityManager related APIs and some leftover related to SecurityManager 
>> changes?
>> 
>> This addresses https://bugs.openjdk.org/browse/JDK-8345286. Most of these 
>> changes are trivial. The 
>> `src/java.base/linux/classes/jdk/internal/platform/CgroupUtil.java` used to 
>> expose utility methods for dealing with SecurityManager permissions and it 
>> was called from a few places. That class is no longer needed with the clean 
>> up done in this PR.
>> 
>> No new tests have been introduced and tier testing is currently in progress.
>
> Jaikiran Pai has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Update 
> src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
>   
>   Co-authored-by: Severin Gehwolf 

cgroups changes look good. Haven't looked at the other bits.

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
 line 29:

> 27: package jdk.internal.platform;
> 28: 
> 29: import java.io.BufferedReader;

Unused now?

-

Marked as reviewed by sgehwolf (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22478#pullrequestreview-2472822295
PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865928383


Re: RFR: 8345286: Remove use of SecurityManager API from misc areas [v2]

2024-12-02 Thread Severin Gehwolf
On Mon, 2 Dec 2024 14:03:10 GMT, Jaikiran Pai  wrote:

>> src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
>>  line 70:
>> 
>>> 68: String line = bufferedReader.readLine();
>>> 69: return line;
>>> 70: } catch (IOException e) {
>> 
>> We can use `Files.lines` here right away:
>> 
>> Suggestion:
>> 
>> try (Stream lines = Files.lines(filePath)) {
>> Optional firstLine = lines.findFirst();
>> return firstLine.orElse(null);
>> } catch (UncheckedIOException | IOException e) {
>
> Is the catch for `UncheckedIOException` due to some previously known failure 
> resulting in the use of these APIs?

Without using `Files.lines()` no `UncheckedIOException` would be thrown. Just 
`IOException` and `null` returned. This extra catch is there to avoid new 
`UncheckedIOException` being thrown on `findFirst()`. I.e. to keep semantics 
the same as before.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865913984


Re: RFR: 8345286: Remove use of SecurityManager API from misc areas [v2]

2024-12-02 Thread Severin Gehwolf
On Mon, 2 Dec 2024 12:43:20 GMT, Alan Bateman  wrote:

>> Jaikiran Pai has updated the pull request incrementally with four additional 
>> commits since the last revision:
>> 
>>  - remove unnecessary space
>>  - Path.of() instead of Paths.get()
>>  - fix formatting of try-with-resources in CgroupSubsystemController.java
>>  - leave out MethodUtil from the clean up
>
> src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
>  line 167:
> 
>> 165: if (controller == null) return defaultRetval;
>> 166: 
>> 167: try (Stream lines = 
>> Files.lines(Paths.get(controller.path(), param))) {
> 
> Using Path.of might be clearer here.

What Alan said.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865861779


Re: RFR: 8345286: Remove use of SecurityManager API from misc areas [v2]

2024-12-02 Thread Severin Gehwolf
On Mon, 2 Dec 2024 13:39:57 GMT, Jaikiran Pai  wrote:

>> Can I please get a review of this change which removes usages of 
>> SecurityManager related APIs and some leftover related to SecurityManager 
>> changes?
>> 
>> This addresses https://bugs.openjdk.org/browse/JDK-8345286. Most of these 
>> changes are trivial. The 
>> `src/java.base/linux/classes/jdk/internal/platform/CgroupUtil.java` used to 
>> expose utility methods for dealing with SecurityManager permissions and it 
>> was called from a few places. That class is no longer needed with the clean 
>> up done in this PR.
>> 
>> No new tests have been introduced and tier testing is currently in progress.
>
> Jaikiran Pai has updated the pull request incrementally with four additional 
> commits since the last revision:
> 
>  - remove unnecessary space
>  - Path.of() instead of Paths.get()
>  - fix formatting of try-with-resources in CgroupSubsystemController.java
>  - leave out MethodUtil from the clean up

I've reviewed only the cgroups parts.

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
 line 70:

> 68: String line = bufferedReader.readLine();
> 69: return line;
> 70: } catch (IOException e) {

We can use `Files.lines` here right away:

Suggestion:

try (Stream lines = Files.lines(filePath)) {
Optional firstLine = lines.findFirst();
return firstLine.orElse(null);
} catch (UncheckedIOException | IOException e) {

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java
 line 71:

> 69: String line = bufferedReader.readLine();
> 70: return line;
> 71: } catch (IOException e) {

I think we can do this refactoring now:

Suggestion:

try (Stream lines =
 Files.lines(Paths.of(controller.path(), param))) {
Optional firstLine = lines.findFirst();
return firstLine.orElse(null);
} catch (IOException e) {

src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java
 line 332:

> 330: 
> 331: private long sumTokensIOStat(Function mapFunc) {
> 332: try (Stream lines = 
> Files.lines(Paths.get(unified.path(), "io.stat"))) {

Suggestion:

try (Stream lines = Files.lines(Paths.of(unified.path(), 
"io.stat"))) {

src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java
 line 334:

> 332: try (Stream lines = 
> Files.lines(Paths.get(unified.path(), "io.stat"))) {
> 333: return lines.map(mapFunc)
> 334: .collect(Collectors.summingLong(e -> e));

Suggestion:

return lines.map(mapFunc)
.collect(Collectors.summingLong(e -> e));

-

PR Review: https://git.openjdk.org/jdk/pull/22478#pullrequestreview-2472643767
PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865876078
PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865860841
PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865815063
PR Review Comment: https://git.openjdk.org/jdk/pull/22478#discussion_r1865865085


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-12-02 Thread Severin Gehwolf
On Sun, 1 Dec 2024 12:48:40 GMT, Sergey Chernyshev  
wrote:

> In the Cloudflare case (cg v1 before patch), the path 
> `/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c`
>  will be adjusted as follows:

I assume that's the adjustment logic (pre-patch) that happens provided the 
setting to `nullptr` issue is addressed in cg v1?

> ```
> /sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
> /sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/
> /sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/
> /sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/
> /sys/fs/cgroup/cpu,cpuacct/system.slice/
> /sys/fs/cgroup/cpu,cpuacct/
> ```
> 
> In cg v1 systems, all the above paths except for the very last, are invalid. 
> So, no chance for detecting any lower limits. When a process moved to a 
> subgroup, the trailing part of cgroup_path will constitute the subgroup path.

And after the patch this would become this, right?


/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
/sys/fs/cgroup/cpu,cpuacct/


If so, the change of logic isn't that different (except for where the existence 
checks are happening). This path adjustment only happens at boot, so the next 
look-up would use `/sys/fs/cgroup/cpu,cpuacct/`. So it's a trade-off, special 
case this use-case for added complexity or not.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1865630320


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-29 Thread Severin Gehwolf
On Fri, 29 Nov 2024 14:47:13 GMT, Sergey Chernyshev  
wrote:

> > Right. I'm still not convinced this extra reduction buys us much. The 
> > adjust controller logic will handle it if kept as is in the Metrics version.
> 
> The adjust controller logic won't handle it, because it reduces the path from 
> child to parent. It's goal is to locate the smallest limit.

It's goal is to find the path for a specific controller that has the lowest 
limit and keeps that path for that controller for the rest of the run-time of 
the JVM. Consider a case where this reduction currently happens in 
`set_subsystem_path()`. Instead of the manual work of setting it to a path 
where the directory actually exists we set it to `/`.

In the cloudflare case we'd end up with a subsystem path of 
`/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c`.
 Since the cgroup_path != _root we trigger path adjustment increasing the 
chance to detect any lower limit in any of the paths down to the mount point. 
By doing so **and** there is a lower limit in the hierarchy we know the path 
exists as well and that is being used from then on.

So if  `/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden` existed 
and has a limit it would be used. I thought this is the reason why cg v2 didn't 
need the same fix when `--cgroupns=host` is used.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1863797758


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v6]

2024-11-29 Thread Severin Gehwolf
On Wed, 27 Nov 2024 09:11:22 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   adjust path suffix in cgroup (v1) version specific code, when root != cgroup

test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java line 481:

> 479: assertNotNull("Controller path should not have been null", 
> actualPath);
> 480: String expectedPath = expectedMountPoint + 
> cpuInfo.getCgroupPath();
> 481: assertEquals("Should be equal to the mount point path", 
> expectedPath, actualPath);

Suggestion:

assertEquals("Should be equal to the mount point with cg path 
appended", expectedPath, actualPath);

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1860461756


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-29 Thread Severin Gehwolf
On Wed, 27 Nov 2024 10:56:57 GMT, Sergey Chernyshev  
wrote:

>>> Version specific code can be had in `set_subsystem_path()` of the 
>>> corresponding impl (like an earlier version of your patch). `lowest_limit` 
>>> and `limit_cg_path` fixes are version agnostic and can and should be fixed 
>>> in `CgroupUtil::adjust_controller`.
>> 
>> Done. Please see the updated PR.
>
>> Done. Please see the updated PR.
> 
> The metrics part still needs the update - in the cgroup version specific 
> `setPath()`.

Right. I'm still not convinced this extra reduction buys us much. The adjust 
controller logic will handle it if kept as is in the Metrics version.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1860465637


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v13]

2024-11-28 Thread Severin Gehwolf
On Tue, 19 Nov 2024 13:37:25 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been added in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 38 commits:
> 
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Add exclusive access dirs directive for systemd tests
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Improve systemd slice test for non-root on cg v2
>  - Fix unit tests
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - ... and 28 more: https://git.openjdk.org/jdk/compare/23597361...b56fc7b7

Keep open bot.

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2506124479


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v6]

2024-11-27 Thread Severin Gehwolf
On Wed, 27 Nov 2024 09:11:22 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   adjust path suffix in cgroup (v1) version specific code, when root != cgroup

src/hotspot/os/linux/cgroupUtil_linux.cpp line 70:

> 68: os::free(limit_cg_path); // handles nullptr
> 69: limit_cg_path = os::strdup(cg_path);
> 70:   }

We can avoid the duplicate copy of the original cgroup path, which is already 
captured in `orig` by using:


jlong lowest_limit = limit < 0 ? phys_mem : limit;
julong orig_limit = ((julong)lowest_limit) != phys_mem ? lowest_limit : 
phys_mem;


And on line 91 we change the condition from:


if ((julong)lowest_limit != phys_mem) {


to:


if ((julong)lowest_limit != orig_limit) {

src/hotspot/os/linux/cgroupUtil_linux.cpp line 129:

> 127: lowest_limit = cpus;
> 128: os::free(limit_cg_path); // handles nullptr
> 129: limit_cg_path = os::strdup(cg_path);

Same here with the extra allocation of `cg_path`;

src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 320:

> 318: ss.print_raw(cgroup_path);
> 319: if (strstr((char*)cgroup_path, "../") != nullptr) {
> 320:   log_warning(os, container)("Cgroup v2 path at [%s] is [%s], cgroup 
> limits can be wrong.",

Suggestion:

  log_warning(os, container)("Cgroup cpu/memory controller path includes 
'../', detected limits won't be accurate",

src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 322:

> 320:   log_warning(os, container)("Cgroup v2 path at [%s] is [%s], cgroup 
> limits can be wrong.",
> 321: mount_path, cgroup_path);
> 322: }

Why the cast to `char*`?

We should probably move this warning to `CgroupUtil::adjust_controller`, right 
before we've determined that we actually need to adjust. I wonder, though, if 
we should just print the warning and set the cgroup_path to `/` and return 
early. Otherwise, path adjustment will run with no different result.

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java
 line 66:

> 64: // Rely on path adjustment that determines the 
> actual suffix.
> 65: path += cgroupPath;
> 66: }

This seems a simpler solution than the hotspot one. While I prefer this one, 
please make the

Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v5]

2024-11-25 Thread Severin Gehwolf
On Fri, 22 Nov 2024 15:08:18 GMT, Sergey Chernyshev  
wrote:

> > One thing to note is that the new test requires root privileges (AFAIK). We 
> > should skip the test if we are being run as root.
> 
> The test works just like other docker tests, root privileges are not 
> required. In systems that can't run docker for any reason, it should be 
> skipped.

The new tests directly write into the cgroup interface files (does away with 
relying on the container framework doing it). For example:


"echo '+memory' > /sys/fs/cgroup/memory/cgroup.subtree_control ; "


... relies on the user in the container image to be `root`. So depending on 
which base image is being used - by means of `-Djdk.test.docker.image.name=XXX 
-Djdk.test.docker.image.version=XX` - it might fail unexpectedly with 
permission errors.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2497440576


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-25 Thread Severin Gehwolf
On Sat, 23 Nov 2024 10:28:42 GMT, Sergey Chernyshev  
wrote:

>>> The added code in `CgroupUtil::adjust_controller` runs for cg v1 and cg v2 
>>> when path adjustment is deemed needed. So I'm not clear why it's needed for 
>>> cg v2
>> 
>> It looks like there's no way to see at this point, if we are in cgroup v1 or 
>> v2 - if I am not mistaken.
>> 
>>> I thought we have established that for cg v2 && `--cgroupns=host` the 
>>> current path adjustment is sufficient? What am I missing?
>> 
>> The current path adjustment still needs correction of `lowest_limit` in cg 
>> v2. Also allocating `limit_cg_path` is important, that's why I added the if 
>> block.
>
>> It looks like there's no way to see at this point, if we are in cgroup v1 or 
>> v2 - if I am not mistaken.
> 
> On the other hand, a type parameter can be added to 
> `CgroupUtil::adjust_controller()`. Would you recommend doing so?

@sercher I wouldn't recommend adding a type parameter. 
`CgroupUtil::adjust_controller` should stay cgroup version agnostic. Version 
specific code can be had in `set_subsystem_path()` of the corresponding impl 
(like an earlier version of your patch). `lowest_limit` and `limit_cg_path` 
fixes are version agnostic and can and should be fixed in 
`CgroupUtil::adjust_controller`.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1856241432


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-22 Thread Severin Gehwolf
On Fri, 22 Nov 2024 13:00:14 GMT, Sergey Chernyshev  
wrote:

>>> Here, `limit` at line 64 is not stored as a possible lowest limit, so if 
>>> the inner group has lower limit than the outer group, it won't be detected 
>>> (cg v2 is affected too).
>> 
>> Good spot! How about this to fix it?
>> 
>> 
>> jlong limit = mem->read_memory_limit_in_bytes(phys_mem); 
>> jlong lowest_limit = limit < 0 ? phys_mem: limit;
>> 
>> 
>>> The cgroup path will be adjusted to the outer group (when it's limited). 
>>> Another issue is in the loop at line 66 that reduces the path. In cg v1 in 
>>> `cgroupns=host` mode (default) the cgroup_path includes the base cgroup and 
>>> the subgroup (when moved). It may be not quite obvious which part of the 
>>> path is the actual subgroup (and the CloudFoundry case has demonstrated 
>>> it). It is suggested to determine the actual subgroup (path suffix) before 
>>> reducing the group path. Please see the updated patch.
>> 
>> I'm worried about the added complexity. 1.) Is this something that's needed 
>> in cg v2 too? If no, we are changing cg version agnostic code for a version 
>> specific issue. 2.) Wouldn't setting the cgroup path to 
>> `/` achieve the same thing, when it's currently 
>> `null` (in current master)?
>> 
>> After all, those are corner cases which don't seem very common.
>
>> Good spot! How about this to fix it?
>> 
>> ```
>> jlong limit = mem->read_memory_limit_in_bytes(phys_mem); 
>> jlong lowest_limit = limit < 0 ? phys_mem: limit;
>> ```
> 
> Make sense to me.
> 
>> I'm worried about the added complexity. 1.) Is this something that's needed 
>> in cg v2 too? If no, we are changing cg version agnostic code for a version 
>> specific issue. 2.) Wouldn't setting the cgroup path to 
>> `/` achieve the same thing, when it's currently 
>> `null` (in current master)?
> 
> The added complexity only kicks in in cg v1 when `_root != cgroup_path`, so 
> exactly in that corner case. In cg v2 it only works with `--cgroupns=host`, 
> it ends up calling stat() once and then continues normally, so the added 
> overhead is minimal.

The added code in `CgroupUtil::adjust_controller` runs for cg v1 and cg v2 when 
path adjustment is deemed needed. So I'm not clear why it's needed for cg v2. I 
thought we have established that for cg v2 && `--cgroupns=host` the current 
path adjustment is sufficient? What am I missing?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1854277359


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-22 Thread Severin Gehwolf
On Fri, 22 Nov 2024 09:54:39 GMT, Sergey Chernyshev  
wrote:

> Here, `limit` at line 64 is not stored as a possible lowest limit, so if the 
> inner group has lower limit than the outer group, it won't be detected (cg v2 
> is affected too).

Good spot! How about this to fix it?


jlong limit = mem->read_memory_limit_in_bytes(phys_mem); 
jlong lowest_limit = limit < 0 ? phys_mem: limit;


> The cgroup path will be adjusted to the outer group (when it's limited). 
> Another issue is in the loop at line 66 that reduces the path. In cg v1 in 
> `cgroupns=host` mode (default) the cgroup_path includes the base cgroup and 
> the subgroup (when moved). It may be not quite obvious which part of the path 
> is the actual subgroup (and the CloudFoundry case has demonstrated it). It is 
> suggested to determine the actual subgroup (path suffix) before reducing the 
> group path. Please see the updated patch.

I'm worried about the added complexity. 1.) Is this something that's needed in 
cg v2 too? If no, we are changing cg version agnostic code for a version 
specific issue. 2.) Wouldn't setting the cgroup path to 
`/` achieve the same thing, when it's currently 
`null` (in current master)?

After all, those are corner cases which don't seem very common.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1853729062


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v5]

2024-11-22 Thread Severin Gehwolf
On Fri, 22 Nov 2024 09:57:44 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains seven additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into JDK-8343191
>  - warn wenn ../ encountered, update path adjustment
>  - Update src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp
>
>Co-authored-by: Severin Gehwolf 
>  - Merge branch 'master' into JDK-8343191
>  - patch reimplemented
>  - fix the logic that skips duplicate controller's mount points
>  - 8343191: Cgroup v1 subsystem fails to set subsystem path

I like the added testing. One thing to note is that the new test requires root 
privileges (AFAIK). We should skip the test if we are being run as root.

The added complexity is a concern. I hope we can reduce it a bit (on the 
expense of a longer hierarchy walk).

-

PR Review: https://git.openjdk.org/jdk/pull/21808#pullrequestreview-2454136634


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v13]

2024-11-19 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 38 commits:

 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests
 - Add JVM_HostActiveProcessorCount using JVM api
 - ... and 28 more: https://git.openjdk.org/jdk/compare/23597361...b56fc7b7

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=12
  Stats: 1595 lines in 27 files changed: 1344 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-12 Thread Severin Gehwolf
On Tue, 12 Nov 2024 14:59:41 GMT, Sergey Chernyshev  
wrote:

> > The JBS issue doesn't mention `NullPointerException`. It would be good to 
> > list the observed NPE issue.
> 
> Example for NPE:
> 
> ```
> public class Test {
> public static void main(String[] args) {
> java.lang.management.ManagementFactory.getPlatformMBeanServer();
> System.out.println("PASSED.");
> }
> }
> ```
> 
> Script (cg v1):
> 
> ```
> sudo docker run --tty=true --rm --volume=$JAVA_HOME:/jdk 
> --volume=./classes:/classes:ro --memory 400m ubuntu:latest \
> sh -c "sleep 10 ; /jdk/bin/java -cp .:/classes Test" &
> sleep 10;
> HOSTPID=$(sudo ps -ef | awk '/jdk/bin/java/ && !/docker/ && !/awk/ { print $2 
> }')
> echo $HOSTPID | sudo tee /sys/fs/cgroup/memory/test/cgroup.procs > /dev/null
> sleep 10
> ```
> 
> Result (cg v1 before patch):
> 
> ```
> Exception in thread "main" java.lang.NullPointerException
>   at java.base/java.util.Objects.requireNonNull(Objects.java:220)
>   at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:296)
>   at java.base/java.nio.file.Path.of(Path.java:148)
>   at java.base/java.nio.file.Paths.get(Paths.java:69)
>   at 
> java.base/jdk.internal.platform.CgroupUtil.lambda$readStringValue$0(CgroupUtil.java:67)
>   at 
> java.base/java.security.AccessController.doPrivileged(AccessController.java:571)
>   at 
> java.base/jdk.internal.platform.CgroupUtil.readStringValue(CgroupUtil.java:69)
>   at 
> java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:65)
>   at 
> java.base/jdk.internal.platform.CgroupSubsystemController.getLongValue(CgroupSubsystemController.java:124)
>   at 
> java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getLongValue(CgroupV1Subsystem.java:190)
>   at 
> java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getHierarchical(CgroupV1Subsystem.java:160)
>   at 
> java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.initSubSystem(CgroupV1Subsystem.java:85)
>   at 
> java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getInstance(CgroupV1Subsystem.java:61)
>   at 
> java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:119)
>   at 
> java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:89)
>   at 
> java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:198)
>   at 
> java.base/jdk.internal.platform.SystemMetrics.instance(SystemMetrics.java:29)
>   at 
> java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:58)
>   at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
>   at 
> jdk.management/com.sun.management.internal.OperatingSystemImpl.(OperatingSystemImpl.java:175)
>   at 
> jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean(PlatformMBeanProviderImpl.java:316)
>   at 
> jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl$4.nameToMBeanMap(PlatformMBeanProviderImpl.java:235)
>   at 
> java.management/java.lang.management.ManagementFactory.lambda$getPlatformMBeanServer$0(ManagementFactory.java:489)
>   at 
> java.base/java.util.stream.ReferencePipeline$7$1FlatMap.accept(ReferencePipeline.java:289)
>   at 
> java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:197)
>   at 
> java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1788)
>   at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
>   at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
>   at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:153)
>   at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:176)
>   at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
>   at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:636)
>   at 
> java.management/java.lang.management.ManagementFactory.getPlatformMBeanServer(ManagementFactory.java:490)
>   at Test.main(Test.java:3)
> ```

Thanks!

> After patch:
> 
> ```
> [0.001s][warning][os,container] Cgroup v1 controller (/sys/fs/cgroup/memory) 
> mounting root 
> [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] 
> doesn't match cgroup [/test]
> [0.001s][warning][os,container] Cgroup v1 controller (/sys/fs/cgroup/memory) 
> mounting root 
> [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] 
> doesn't match cgroup [/]
> [0.001s][warning][os,container] Cgroup v1 controller (/sys/fs/cgroup/memory) 
> mounting root 
> [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] 
> doesn't match cgroup [/]
> Nov 12, 2024 1:14:11 PM 
> jdk.int

Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-12 Thread Severin Gehwolf
On Thu, 7 Nov 2024 22:31:21 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into JDK-8343191
>  - patch reimplemented
>  - fix the logic that skips duplicate controller's mount points
>  - 8343191: Cgroup v1 subsystem fails to set subsystem path

We really need to consider adding container tests for this use-case. Perhaps we 
can require root perms for it. The tricky part would be adding appropriate 
synchronization for the cgroup move of the shell process and subsequent `java` 
runs.

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 54:

> 52: ss.print_raw(_mount_point);
> 53: if (strcmp(_root, "/") == 0) {
> 54:   // host processes / containers w/private cgroup namespace

Suggestion:

  // host processes and containers with cgroupns=private

src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 62:

> 60:   // containers only, warn if doesn't match
> 61:   log_warning(os, container)("Cgroup v1 controller (%s) mounting root 
> [%s] doesn't match cgroup [%s]",
> 62: _mount_point, _root, cgroup_path);

Why this warning?

It appears it would make more sense to produce this warning when `cgroup_path` 
contains `../` and falling back to the `mount_path` for the subsystem path 
which indicates we have a `cgroupns=private` deployment on CG v1, but would 
likely get away with it since `memory.limit_in_bytes` will be present at the 
mount root.

If `cgroup_path` doesn't contain `../` we should append the `cgroup_path` to 
the `_mount_point` similar to what we do for cg v2. In the cloudflare case we'd 
end up with a subsystem path of 
`/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c`.
 Since the `cgroup_path != _root` we trigger path adjustment increasing the 
chance to detect any lower limit in any of the paths down to the mount point.

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java
 line 46:

> 44: }
> 45: 
> 46: public void setPath(String cgroupPath) {

This should behave the same as Hotspot and also append the cgroup path to th

Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-12 Thread Severin Gehwolf
On Tue, 12 Nov 2024 19:09:54 GMT, Sergey Chernyshev  
wrote:

> > Edit: Yet, cg v2 will get into trouble since there, for example on rootless 
> > podman on cg v2 you'd end up with this instead:
> > ```
> > [0.008s][trace][os,container] OSContainer::init: Initializing Container 
> > Support
> > [0.008s][debug][os,container] Detected optional pids controller entry in 
> > /proc/cgroups
> > [0.008s][debug][os,container] Detected cgroups v2 unified hierarchy
> > [0.008s][trace][os,container] Adjusting controller path for memory: 
> > /sys/fs/cgroup/../../../../../../test
> > [0.008s][trace][os,container] Path to /memory.max is 
> > /sys/fs/cgroup/../../../../../../test/memory.max
> > [0.008s][debug][os,container] Open of file 
> > /sys/fs/cgroup/../../../../../../test/memory.max failed, No such file or 
> > directory
> > [0.008s][trace][os,container] Memory Limit failed: -2
> > ...
> > [0.009s][trace][os,container] Path to /memory.max is 
> > /sys/fs/cgroup/memory.max
> > [0.009s][debug][os,container] Open of file /sys/fs/cgroup/memory.max 
> > failed, No such file or directory
> > [0.009s][trace][os,container] Memory Limit failed: -2
> > [0.009s][trace][os,container] Memory Limit is: -2
> > ```
> 
> Here, the path `/sys/fs/cgroup/memory.max` would normally point to the actual 
> memory limit inside the container. In this particular case, the directory 
> `/sys/fs/cgroup` is empty.

Yes. I believe we need to issue a `warning` log message if the `cgroup_path` 
includes `../` mentioning that memory limits might be wrong. There is nothing 
more we can do here.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2471414814


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-12 Thread Severin Gehwolf
On Mon, 11 Nov 2024 18:28:11 GMT, Sergey Chernyshev  
wrote:

> > So on cg v1 you start out and end with a `subsystem_path() == null` and on 
> > cg v2 you start out and end with a `subsystem_path() == 
> > /../../../../../../test`. In both cases the memory limit of 400m won't be 
> > detected.
> 
> Only when docker fails to mount the cgroup while moving process to an outer 
> group or a sibling group. It's probably not the case with CloudFoundry.

The bug suggests it's a cg v1 only problem, but I'm able to reproduce in cg v2 
too. We should handle both cases more gracefully.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2470033800


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-11 Thread Severin Gehwolf
On Thu, 7 Nov 2024 22:31:21 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into JDK-8343191
>  - patch reimplemented
>  - fix the logic that skips duplicate controller's mount points
>  - 8343191: Cgroup v1 subsystem fails to set subsystem path

So on cg v1 you start out and end with a `subsystem_path() == null` and on cg 
v2 you start out and end with a `subsystem_path() == /../../../../../../test`. 
In both cases the memory limit of 400m won't be detected.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2468488527


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path

2024-11-11 Thread Severin Gehwolf
On Mon, 11 Nov 2024 10:20:02 GMT, Severin Gehwolf  wrote:

> > In the above script, a containerized process (/bin/sh) is moved to cgroup 
> > /test before /jdk/bin/java gets executed. Java inherits cgroup /test from 
> > its parent process, its _root will be /docker/, cgroup_path 
> > will be /test.
>
> OK, but why is https://bugs.openjdk.org/browse/JDK-8322420 not in effect in 
> such a case?

Answering my own question. Because the `set_subsystem_path()` function for cg 
v1 in this unusual setup returns `null`.


[0.001s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected optional pids controller entry in 
/proc/cgroups
[0.002s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, 
using cgroups v1 controllers
[0.002s][trace][os,container] Adjusting controller path for memory: (null)
[0.002s][debug][os,container] read_string: subsystem path is null
[0.002s][trace][os,container] Memory Limit failed: -2
[0.002s][debug][os,container] read_string: subsystem path is null
[0.002s][trace][os,container] Memory Limit failed: -2
[0.002s][trace][os,container] No lower limit found for memory in hierarchy 
/sys/fs/cgroup/memory, adjusting to original path /test
[0.002s][debug][os,container] OSContainer::init: is_containerized() = true 
because all controllers are mounted read-only (container case)
[0.003s][trace][os,container] Path to /cpu.cfs_quota_us is 
/sys/fs/cgroup/cpu,cpuacct/cpu.cfs_quota_us
[0.003s][trace][os,container] CPU Quota is: -1
[0.003s][trace][os,container] Path to /cpu.cfs_period_us is 
/sys/fs/cgroup/cpu,cpuacct/cpu.cfs_period_us
[0.003s][trace][os,container] CPU Period is: 10
[0.003s][trace][os,container] OSContainer::active_processor_count: 12
[0.003s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 
12
[0.003s][trace][os,container] total physical memory: 67163226112
[0.003s][debug][os,container] read_string: subsystem path is null
[0.003s][trace][os,container] Memory Limit failed: -2
[0.005s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 
12
[0.021s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 
12
openjdk 24-internal 2025-03-18
OpenJDK Runtime Environment (build 24-internal-adhoc.sgehwolf.jdk-jdk)
OpenJDK 64-Bit Server VM (build 24-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, 
sharing)


On cg v2, on the other hand, `set_subsystem_path()` will never set the path to 
a `null` value.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2468417142


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path

2024-11-11 Thread Severin Gehwolf
On Thu, 7 Nov 2024 13:28:29 GMT, Sergey Chernyshev  
wrote:

> Create a new cgroup for memory
> 
> ```
> sudo mkdir -p /sys/fs/cgroup/memory/test
> ```
> 
> Run the following script
> 
> ```
> docker run --tty=true --rm --volume=$JAVA_HOME:/jdk --memory 400m 
> ubuntu:latest \
> sh -c "sleep 10 ; /jdk/bin/java -Xlog:os+container=trace -version" | grep 
> Memory\ Limit &
> sleep 10
> HOSTPID=$(sudo ps -ef | awk '/container=trace/ && !/docker/ && !/awk/ { print 
> $2 }')
> echo $HOSTPID | sudo tee /sys/fs/cgroup/memory/test/cgroup.procs
> sleep 10
> ```
> 
> In the above script, a containerized process (`/bin/sh`) is moved to cgroup 
> `/test` before `/jdk/bin/java` gets executed. Java inherits cgroup `/test` 
> from its parent process, its `_root` will be `/docker/`, 
> `cgroup_path` will be `/test`.

OK, but why is https://bugs.openjdk.org/browse/JDK-8322420 not in effect in 
such a case?
 
> The result would be ($JAVA_HOME points to JDK before fix)
> 
> ```
> 9804
> [0.001s][trace][os,container] Memory Limit failed: -2
> [0.001s][trace][os,container] Memory Limit failed: -2
> [0.002s][trace][os,container] Memory Limit failed: -2
> [0.043s][trace][os,container] Memory Limit failed: -2
> ```
> 
> JDK updated version:
> 
> ```
> 10001
> [0.001s][trace  ][os,container] Memory Limit is: 419430400
> [0.001s][trace  ][os,container] Memory Limit is: 419430400
> [0.002s][trace  ][os,container] Memory Limit is: 419430400
> [0.035s][trace  ][os,container] Memory Limit is: 419430400
> ```

It would be good to see the full boot JVM output at the trace level. I'm 
wondering why the adjustment isn't sufficient for the use-case the bug 
describes. I.e. if the move happens *before* the JVM starts then there is a 
chance it works OK by detecting some limit. If not it would really be useful to 
understand it better.

If, however, the cgroup move happens after the JVM has started, there is 
nothing in the JVM which "corrects" the detected physical memory (i.e. heap 
size et. al) and/or detected CPUs. It's not supported to do that dynamically.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2467779326


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-11 Thread Severin Gehwolf
On Mon, 11 Nov 2024 10:06:11 GMT, Severin Gehwolf  wrote:

> > I didn't check cg v2 because the issue (NPE) was observed in v1 hosts only.
> 
> The JBS issue doesn't mention `NullPointerException`. It would be good to 
> list the observed NPE issue.

I also wonder, then, if the issue is NPE if 
[JDK-8336881](https://bugs.openjdk.org/browse/JDK-8336881) would fix that 
issue. The controller adjustment doesn't yet happen on the Java (Metrics) 
level. Only hotspot so far.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2467787891


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-11 Thread Severin Gehwolf
On Fri, 8 Nov 2024 16:11:37 GMT, Sergey Chernyshev  
wrote:

> I didn't check cg v2 because the issue (NPE) was observed in v1 hosts only.

The JBS issue doesn't mention `NullPointerException`. It would be good to list 
the observed NPE issue.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2467736667


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

2024-11-08 Thread Severin Gehwolf
On Thu, 7 Nov 2024 22:31:21 GMT, Sergey Chernyshev  
wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
>> certain cases, that may lead to controllers left undetected/inactive. We 
>> observed the behavior in CloudFoundry deployments, it affects also host 
>> systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 
>> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
>> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - 
>> cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due 
>> to load balancing.
>> 
>> Let's examine the condition at line 64 here 
>> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted 
>> earlier by @jerboaa in 
>> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and 
>> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
>> (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
>> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
>> groups states:
>> 
>> 
>> ...
>>/proc/pid/cgroup (since Linux 2.6.24)
>>   This file describes control groups to which the process
>>   with the corresponding PID belongs.  The displayed
>>   information differs for cgroups version 1 and version 2
>>   hierarchies.
>>   For each cgroup hierarchy of which the process is a
>>   member, there is one entry containing three colon-
>>   separated fields:
>> 
>>   hierarchy-ID:controller-list:cgroup-path
>> 
>>   For example:
>> 
>>   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>   [3]  This field contains the pathname of the control group
>>in the hierarchy to which the process belongs. This
>>pathname is relative to the mount point of the
>>hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into JDK-8343191
>  - patch reimplemented
>  - fix the logic that skips duplicate controller's mount points
>  - 8343191: Cgroup v1 subsystem fails to set subsystem path

Have you checked on cg v2? Is this a problem there as well?

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2464339096


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path

2024-11-07 Thread Severin Gehwolf
On Fri, 1 Nov 2024 13:13:07 GMT, Sergey Chernyshev  
wrote:

> As they're in fact mounting read-write, the logic picked up `rw` mount option 
> and falsely detected "host mode". Also the `--privileged` creates `rw` 
> mounts, so the entire approach needs correction.

Yes. See https://bugs.openjdk.org/browse/JDK-8261242 for details. This patch 
shouldn't change it and the logic of `OSContainer::is_containerized()` 
shouldn't change semantically in all scenarios.

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2454229684


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v12]

2024-10-31 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 37 commits:

 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests
 - Add JVM_HostActiveProcessorCount using JVM api
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - ... and 27 more: https://git.openjdk.org/jdk/compare/c40bb762...8022e864

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=11
  Stats: 1595 lines in 27 files changed: 1344 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path

2024-10-31 Thread Severin Gehwolf
On Thu, 31 Oct 2024 15:00:25 GMT, Sergey Chernyshev  
wrote:

> Cgroup V1 subsustem fails to initialize mounted controllers properly in 
> certain cases, that may lead to controllers left undetected/inactive. We 
> observed the behavior in CloudFoundry deployments, it affects also host 
> systems.
> 
> The relevant /proc/self/mountinfo line is
> 
> 
> 2207 2196 0:43 
> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c 
> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - cgroup 
> cgroup rw,cpu,cpuacct
> 
> 
> /proc/self/cgroup:
> 
> 
> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
> 
> 
> Here, Java runs inside containerized process that is being moved cgroups due 
> to load balancing.
> 
> Let's examine the condition at line 64 here 
> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
> It is always FALSE and the branch is never taken. The issue was spotted 
> earlier by @jerboaa in 
> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
> 
> The original logic was intended to find the common prefix of `_root`and 
> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` 
> (lines 67-68). That could lead to the following results: 
> 
> Example input
> 
> _root = "/a"
> cgroup_path = "/a/b"
> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
> 
> 
> result _path
> 
> "/sys/fs/cgroup/cpu,cpuacct/b"
> 
> 
> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page 
> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control 
> groups states:
> 
> 
> ...
>/proc/pid/cgroup (since Linux 2.6.24)
>   This file describes control groups to which the process
>   with the corresponding PID belongs.  The displayed
>   information differs for cgroups version 1 and version 2
>   hierarchies.
>   For each cgroup hierarchy of which the process is a
>   member, there is one entry containing three colon-
>   separated fields:
> 
>   hierarchy-ID:controller-list:cgroup-path
> 
>   For example:
> 
>   5:cpuacct,cpu,cpuset:/daemons
> ...
>   [3]  This field contains the pathname of the control group
>in the hierarchy to which the process belongs. This
>pathname is relative to the mount point of the
>hierarchy.
> 
> 
> This explicitly states the "pathname is relative to the mount point of the 
> hierarchy". Hence, the correct result could have been
> 
> 
> /sys/fs/cgroup/cpu,cpuacct/a/b
> 
> 
> Howe...

What testing have you done? Did you run existing container tests in:


test/jdk/jdk/internal/platform
test/hotspot/jtreg/containers


As far as I can tell this breaks privileged container runs. I.e. `docker run 
--privileged --memory 400m --memory-swap 400m ... /opt/jdk/bin/java 
-Xlog:os+container=trace` wouldn't pick up the `400m` container limit on CG v1?

-

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2450233596


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v11]

2024-10-31 Thread Severin Gehwolf
On Tue, 22 Oct 2024 14:18:50 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been added in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 36 commits:
> 
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Add exclusive access dirs directive for systemd tests
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Improve systemd slice test for non-root on cg v2
>  - Fix unit tests
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
>  - ... and 26 more: https://git.openjdk.org/jdk/compare/2da7f2bc...a3989e80

Still looking for a reviewer for this.

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2449386120


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v11]

2024-10-22 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 36 commits:

 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests
 - Add JVM_HostActiveProcessorCount using JVM api
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - ... and 26 more: https://git.openjdk.org/jdk/compare/2da7f2bc...a3989e80

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=10
  Stats: 1595 lines in 27 files changed: 1344 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8338411: Implement JEP 486: Permanently Disable the Security Manager

2024-10-15 Thread Severin Gehwolf
On Tue, 15 Oct 2024 16:34:40 GMT, Severin Gehwolf  wrote:

>> This is the implementation of JEP 486: Permanently Disable the Security 
>> Manager. See [JEP 486](https://openjdk.org/jeps/486) for more details. The 
>> [CSR](https://bugs.openjdk.org/browse/JDK-8338412) describes in detail the 
>> main changes in the JEP and also includes an apidiff of the specification 
>> changes.
>> 
>> NOTE: the majority (~95%) of the changes in this PR are test updates 
>> (removal/modifications) and API specification changes, the latter mostly to 
>> remove `@throws SecurityException`. The remaining changes are primarily the 
>> removal of the `SecurityManager`, `Policy`, `AccessController` and other 
>> Security Manager API implementations. There is very little new code.
>> 
>> The code changes can be broken down into roughly the following categories:
>> 
>> 1. Degrading the behavior of Security Manager APIs to either throw 
>> Exceptions by default or provide an execution environment that disallows 
>> access to all resources by default.
>> 2. Changing hundreds of methods and constructors to no longer throw a 
>> `SecurityException` if a Security Manager was enabled. They will operate as 
>> they did in JDK 23 with no Security Manager enabled.
>> 3. Changing the `java` command to exit with a fatal error if a Security 
>> Manager is enabled.
>> 4. Removing the hotspot native code for the privileged stack walk and the 
>> inherited access control context. The remaining hotspot code and tests 
>> related to the Security Manager will be removed immediately after 
>> integration - see [JDK-8341916](https://bugs.openjdk.org/browse/JDK-8341916).
>> 5. Removing or modifying hundreds of tests. Many tests that tested Security 
>> Manager behavior are no longer relevant and thus have been removed or 
>> modified.
>> 
>> There are a handful of Security Manager related tests that are failing and 
>> are at the end of the `test/jdk/ProblemList.txt`, 
>> `test/langtools/ProblemList.txt` and `test/hotspot/jtreg/ProblemList.txt` 
>> files - these will be removed or separate bugs will be filed before 
>> integrating this PR. 
>> 
>> Inside the JDK, we have retained calls to 
>> `SecurityManager::getSecurityManager` and `AccessController::doPrivileged` 
>> for now, as these methods have been degraded to behave the same as they did 
>> in JDK 23 with no Security Manager enabled. After we integrate this JEP, 
>> those calls will be removed in each area (client-libs, core-libs, security, 
>> etc).
>> 
>> I don't expect each reviewer to review all the code changes in this JEP. 
>> Rather, I advise that you only focus on the changes for the area 
>> (client-libs, core-libs, net, ...
>
> Drive-by comment: `java.lang.StackWalker` still has some `checkPermission()` 
> calls that uses:
> 
> 
> SecurityManager sm = System.getSecurityManager();
> if (sm != null) {
> ...
> }
> 
> 
> Intentional?

> @jerboaa Yes - this is intentional:
> 
> > Inside the JDK, we have retained calls to 
> > `SecurityManager::getSecurityManager` and `AccessController::doPrivileged` 
> > for now, as these methods have been degraded to behave the same as they did 
> > in JDK 23 with no Security Manager enabled. After we integrate this JEP, 
> > those calls will be removed in each area (client-libs, core-libs, security, 
> > etc).

OK.

-

PR Comment: https://git.openjdk.org/jdk/pull/21498#issuecomment-2414541204


Re: RFR: 8338411: Implement JEP 486: Permanently Disable the Security Manager

2024-10-15 Thread Severin Gehwolf
On Mon, 14 Oct 2024 13:52:24 GMT, Sean Mullan  wrote:

> This is the implementation of JEP 486: Permanently Disable the Security 
> Manager. See [JEP 486](https://openjdk.org/jeps/486) for more details. The 
> [CSR](https://bugs.openjdk.org/browse/JDK-8338412) describes in detail the 
> main changes in the JEP and also includes an apidiff of the specification 
> changes.
> 
> NOTE: the majority (~95%) of the changes in this PR are test updates 
> (removal/modifications) and API specification changes, the latter mostly to 
> remove `@throws SecurityException`. The remaining changes are primarily the 
> removal of the `SecurityManager`, `Policy`, `AccessController` and other 
> Security Manager API implementations. There is very little new code.
> 
> The code changes can be broken down into roughly the following categories:
> 
> 1. Degrading the behavior of Security Manager APIs to either throw Exceptions 
> by default or provide an execution environment that disallows access to all 
> resources by default.
> 2. Changing hundreds of methods and constructors to no longer throw a 
> `SecurityException` if a Security Manager was enabled. They will operate as 
> they did in JDK 23 with no Security Manager enabled.
> 3. Changing the `java` command to exit with a fatal error if a Security 
> Manager is enabled.
> 4. Removing the hotspot native code for the privileged stack walk and the 
> inherited access control context. The remaining hotspot code and tests 
> related to the Security Manager will be removed immediately after integration 
> - see [JDK-8341916](https://bugs.openjdk.org/browse/JDK-8341916).
> 5. Removing or modifying hundreds of tests. Many tests that tested Security 
> Manager behavior are no longer relevant and thus have been removed or 
> modified.
> 
> There are a handful of Security Manager related tests that are failing and 
> are at the end of the `test/jdk/ProblemList.txt`, 
> `test/langtools/ProblemList.txt` and `test/hotspot/jtreg/ProblemList.txt` 
> files - these will be removed or separate bugs will be filed before 
> integrating this PR. 
> 
> Inside the JDK, we have retained calls to 
> `SecurityManager::getSecurityManager` and `AccessController::doPrivileged` 
> for now, as these methods have been degraded to behave the same as they did 
> in JDK 23 with no Security Manager enabled. After we integrate this JEP, 
> those calls will be removed in each area (client-libs, core-libs, security, 
> etc).
> 
> I don't expect each reviewer to review all the code changes in this JEP. 
> Rather, I advise that you only focus on the changes for the area 
> (client-libs, core-libs, net, security, etc) that you are most f...

Drive-by comment: `java.lang.StackWalker` still has some `checkPermission()` 
calls that uses:


SecurityManager sm = System.getSecurityManager();
if (sm != null) {
...
}


Intentional?

-

PR Review: https://git.openjdk.org/jdk/pull/21498#pullrequestreview-2369966554


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v10]

2024-10-15 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 35 commits:

 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests
 - Add JVM_HostActiveProcessorCount using JVM api
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - ... and 25 more: https://git.openjdk.org/jdk/compare/521effe0...03699b08

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=09
  Stats: 1595 lines in 27 files changed: 1344 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8341138: Rename jtreg property docker.support as container.support [v2]

2024-10-10 Thread Severin Gehwolf
On Thu, 10 Oct 2024 14:27:08 GMT, Ramkumar Sunderbabu  
wrote:

>> The System property "docker.support" defined in VMProps gives a wrong 
>> impression that it is tied to docker alone. The property is common for any 
>> container runtime. Hence, it needs to be renamed as "container.support".
>> 
>> Positive Testing:
>> tier1,tier2,tier3 - to check stability
>> tier5 - to run container tests
>> 
>> Negative Testing:
>> Ran the following groups in hosts where container is not present.
>> open/test/hotspot/jtreg/containers/
>> open/test/jdk/jdk/internal/platform/docker/
>> open/test/jdk/jdk/internal/platform/cgroup/
>
> Ramkumar Sunderbabu has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   addressing review comment

Marked as reviewed by sgehwolf (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/21423#pullrequestreview-2360759661


Re: RFR: 8341138: Rename jtreg property docker.support as container.support

2024-10-09 Thread Severin Gehwolf
On Wed, 9 Oct 2024 14:50:30 GMT, Ramkumar Sunderbabu  
wrote:

> The System property "docker.support" defined in VMProps gives a wrong 
> impression that it is tied to docker alone. The property is common for any 
> container runtime. Hence, it needs to be renamed as "container.support".
> 
> Positive Testing:
> tier1,tier2,tier3 - to check stability
> tier5 - to run container tests
> 
> Negative Testing:
> Ran the following groups in hosts where container is not present.
> open/test/hotspot/jtreg/containers/
> open/test/jdk/jdk/internal/platform/docker/
> open/test/jdk/jdk/internal/platform/cgroup/

Seems OK to me. One minor nit.

test/lib/jdk/test/lib/Container.java line 27:

> 25: 
> 26: public class Container {
> 27: // Use this property to specify container location on your system.

Suggestion:

// Use this property to specify container runtime location (e.g. docker) on 
your system.

-

PR Review: https://git.openjdk.org/jdk/pull/21423#pullrequestreview-2358152114
PR Review Comment: https://git.openjdk.org/jdk/pull/21423#discussion_r1794127601


Integrated: 8341310: Test TestJcmdWithSideCar.java should skip ACCESS_TMP_VIA_PROC_ROOT (after JDK-8327114)

2024-10-03 Thread Severin Gehwolf
On Tue, 1 Oct 2024 16:16:36 GMT, Severin Gehwolf  wrote:

> The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) also 
> increased test coverage. In particular, the `TestJcmdWithSideCar.java` test 
> got enhanced to cover these cases (prior to 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
> tested):
> 
> 1. Shared volumes between attachee and attacher and shared pid namespace
> 2. Shared volumes between attachee and attacher and shared pid namespace, 
> both running with elevated privileges
> 3. Shared pid namespace between attachee and attacher only
> 4. Shared pid namespace between attachee and attacher, both running with 
> elevated privileges
> 
> The OpenJDK attach code is able to handle cases 1 through 3 which pass, but 
> the last case, `4`, hasn't been implemented yet when running as regular user 
> and directing the container runtime to map the container user to that user as 
> well. Thus, the test fails. For now I propose to disable the 4th test case. 
> It can get re-enabled once the product code got updated to account for this 
> case (tracked in https://bugs.openjdk.org/browse/JDK-8341349).
> 
> Thoughts? Could somebody please run this through Oracle's test system in 
> order to see if this fixes the issue? Thank you!

This pull request has now been integrated.

Changeset: 21f8ccf4
Author:Severin Gehwolf 
URL:   
https://git.openjdk.org/jdk/commit/21f8ccf4a97313593f210f9a07e56d5ff92b7aa5
Stats: 9 lines in 1 file changed: 8 ins; 1 del; 0 mod

8341310: Test TestJcmdWithSideCar.java should skip ACCESS_TMP_VIA_PROC_ROOT 
(after JDK-8327114)

Reviewed-by: kevinw

-

PR: https://git.openjdk.org/jdk/pull/21289


Re: RFR: 8341310: Test TestJcmdWithSideCar.java should skip ACCESS_TMP_VIA_PROC_ROOT (after JDK-8327114) [v3]

2024-10-03 Thread Severin Gehwolf
On Wed, 2 Oct 2024 18:46:07 GMT, Severin Gehwolf  wrote:

>> The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) 
>> also increased test coverage. In particular, the `TestJcmdWithSideCar.java` 
>> test got enhanced to cover these cases (prior to 
>> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
>> tested):
>> 
>> 1. Shared volumes between attachee and attacher and shared pid namespace
>> 2. Shared volumes between attachee and attacher and shared pid namespace, 
>> both running with elevated privileges
>> 3. Shared pid namespace between attachee and attacher only
>> 4. Shared pid namespace between attachee and attacher, both running with 
>> elevated privileges
>> 
>> The OpenJDK attach code is able to handle cases 1 through 3 which pass, but 
>> the last case, `4`, hasn't been implemented yet when running as regular user 
>> and directing the container runtime to map the container user to that user 
>> as well. Thus, the test fails. For now I propose to disable the 4th test 
>> case. It can get re-enabled once the product code got updated to account for 
>> this case (tracked in https://bugs.openjdk.org/browse/JDK-8341349).
>> 
>> Thoughts? Could somebody please run this through Oracle's test system in 
>> order to see if this fixes the issue? Thank you!
>
> Severin Gehwolf has updated the pull request incrementally with two 
> additional commits since the last revision:
> 
>  - Revert "Improve runtime of test"
>
>This reverts commit 5b2f646c73b747f6fff364347031074d24e49822.
>  - Revert "Remove the attachee container if it exists"
>
>This reverts commit ef7abf249268c30f726bee19dde3337d92c6493d.

Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2391348632


Re: RFR: 8341310: Test TestJcmdWithSideCar.java should skip ACCESS_TMP_VIA_PROC_ROOT (after JDK-8327114) [v3]

2024-10-03 Thread Severin Gehwolf
On Thu, 3 Oct 2024 10:43:56 GMT, Kevin Walls  wrote:

>> Severin Gehwolf has updated the pull request incrementally with two 
>> additional commits since the last revision:
>> 
>>  - Revert "Improve runtime of test"
>>
>>This reverts commit 5b2f646c73b747f6fff364347031074d24e49822.
>>  - Revert "Remove the attachee container if it exists"
>>
>>This reverts commit ef7abf249268c30f726bee19dde3337d92c6493d.
>
> Good timing, I was just writing:
> 
> Thanks, looks good.  Good to delay the additional changes.
> Would be great if you can change the bug and PR title to something like:
> 8341310: Test TestJcmdWithSideCar.java should skip ACCESS_TMP_VIA_PROC_ROOT 
> (after JDK-8327114)
> (Multiple "test fails after..." bugs are confusing to me at least!)
> 
> With the test as it stands in the repo currently, I am seeing another 
> failure.  I don't get this myself with the change in this PR, but that may 
> just be luck.  
> 
> It's on Linux x64, with TMP_MOUNTED_INTO_SIDECAR, where the command:
> 
> docker run --tty=true --rm --cap-add=SYS_PTRACE --sig-proxy=true 
> --pid=container:test-container-main --cap-add=NET_BIND_SERVICE 
> --user=10668:10668 --volumes-from test-container-main 
> jdk-internal:test-containers-docker-TestJcmdWithSideCar-jfr-jcmd 
> /jdk/bin/jcmd -l 
> 
> ...gets no output, where a good run would show:
> 
> [STDOUT]
> 1 EventGeneratorLoop 120
> 24 jdk.jcmd/sun.tools.jcmd.JCmd -l
> 
> e.g.
> 
> 
> [main-container-process] MAIN_METHOD_STARTED, argument is 120
> Attach strategy TMP_MOUNTED_INTO_SIDECAR
> [COMMAND]
> docker run --tty=true --rm --cap-add=SYS_PTRACE --sig-proxy=true 
> --pid=container:test-container-main --cap-add=NET_BIND_SERVICE 
> --user=10668:10668 --volumes-from test-container-main 
> jdk-internal:test-containers-docker-TestJcmdWithSideCar-jfr-jcmd 
> /jdk/bin/jcmd -l 
> [2024-10-03T04:30:35.273416534Z] Gathering output for process 12125
> [ELAPSED: 1068 ms]
> [STDERR]
> 
> [STDOUT]
> 
> Full child process STDOUT was saved to docker-stdout-12125.log
> [2024-10-03T04:30:36.341378706Z] Waiting for completion for process 12125
> [2024-10-03T04:30:36.341534140Z] Waiting for completion finished for process 
> 12125
> [COMMAND]
> docker rmi --force 
> jdk-internal:test-containers-docker-TestJcmdWithSideCar-jfr-jcmd 
> [2024-10-03T04:30:36.349399928Z] Gathering output for process 12260
> [ELAPSED: 27 ms]
> [STDERR]
> 
> [STDOUT]
> Untagged: jdk-internal:test-containers-docker-TestJcmdWithSideCar-jfr-jcmd
> 
> Full child process STDOUT was saved to docker-stdout-12260.log
> --System.err:(16/748)--
>  stdout: [];
>  stderr: []
>  exitValue = 0
> 
> java.lang.RuntimeException: 'sun.tools.jcmd.JCmd' missing from stdout/stderr
>   at 
> jdk.test.lib.process.OutputAnalyzer.shouldContain(OutputAnalyzer.java:253)
>   at TestJcmdWithSideCar.testCase01(TestJcmdWithSideCar.java:135)
>   at TestJcmdWithSideCar.main(TestJcmdWithSideCar.java:111)
>   at 
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
>   at java.base/java.lang.reflect.Met...

@kevinjwalls

> Thanks, looks good. Good to delay the additional changes. Would be great if 
> you can change the bug and PR title to something like: 8341310: Test 
> TestJcmdWithSideCar.java should skip ACCESS_TMP_VIA_PROC_ROOT (after 
> JDK-8327114) (Multiple "test fails after..." bugs are confusing to me at 
> least!)

OK. Done.
 
> With the test as it stands in the repo currently, I am seeing another 
> failure. I don't get this myself with the change in this PR, but that may 
> just be luck.

I'm confused. Are you saying that after this patch in revision 
https://github.com/openjdk/jdk/pull/21289/commits/24a4303f75fb854098a74ddf6be98a7981a45cc8
 there are still failures in Oracle's CI?

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2391275788


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114

2024-10-03 Thread Severin Gehwolf
On Tue, 1 Oct 2024 16:56:11 GMT, Kevin Walls  wrote:

> I can check our testing with this change...

@kevinjwalls Any update on this? Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2391044188


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v2]

2024-10-03 Thread Severin Gehwolf
On Wed, 2 Oct 2024 21:15:11 GMT, Larry Cable  wrote:

>> this is a fix to: https://bugs.openjdk.org/browse/JDK-8327114 
>> 
>> to resolve an issue detected in: 
>> 
>> https://bugs.openjdk.org/browse/JDK-8341246
>> 
>> /proc/**/* file accesses should be performed as "privileged" actions to 
>> avoid security mgr exceptions.
>
> Larry Cable has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   JDK-8327114: fix to resolve permissions issue as per: 8341246, also 
> privileged exists and isReadable invocations

Yes, agreed. But we need a new bug for this.

-

PR Comment: https://git.openjdk.org/jdk/pull/21312#issuecomment-2390953016


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114 [v2]

2024-10-03 Thread Severin Gehwolf
On Thu, 3 Oct 2024 05:27:33 GMT, Sebastian Lövdahl  wrote:

> > Filed https://bugs.openjdk.org/browse/JDK-8341436 to track this separate 
> > issue.
> 
> Do you intend to look into this yourself? If not, I would be happy to pick it 
> up.

Please go ahead and have a go at this.

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2390775602


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)

2024-10-03 Thread Severin Gehwolf
On Wed, 2 Oct 2024 19:59:31 GMT, Larry Cable  wrote:

>> this is a fix to: https://bugs.openjdk.org/browse/JDK-8327114 
>> 
>> to resolve an issue detected in: 
>> 
>> https://bugs.openjdk.org/browse/JDK-8341246
>> 
>> /proc/**/* file accesses should be performed as "privileged" actions to 
>> avoid security mgr exceptions.
>
> @kevinjwalls @slovdahl @jerboaa @sendaoYan @dcubed-ojdk

@larry-cable [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) is 
fixed. This PR should get integrated with a new bug. It cannot use the same bug 
unless it's a backport, which this clearly isn't. Also the permission issue is 
fixed with https://github.com/openjdk/jdk/pull/21269 (integrated).

-

PR Comment: https://git.openjdk.org/jdk/pull/21312#issuecomment-2390770485


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114 [v2]

2024-10-02 Thread Severin Gehwolf
On Wed, 2 Oct 2024 18:28:11 GMT, Severin Gehwolf  wrote:

> > Right, this turned into a rabbit hole of its own.. 😁
> 
> Thanks. I'll remove this part from the patch as it's orthogonal to the actual 
> issue and would just like to get the test passing again.

Filed https://bugs.openjdk.org/browse/JDK-8341436 to track this separate issue.

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2389444807


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114 [v3]

2024-10-02 Thread Severin Gehwolf
> The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) also 
> increased test coverage. In particular, the `TestJcmdWithSideCar.java` test 
> got enhanced to cover these cases (prior to 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
> tested):
> 
> 1. Shared volumes between attachee and attacher and shared pid namespace
> 2. Shared volumes between attachee and attacher and shared pid namespace, 
> both running with elevated privileges
> 3. Shared pid namespace between attachee and attacher only
> 4. Shared pid namespace between attachee and attacher, both running with 
> elevated privileges
> 
> The OpenJDK attach code is able to handle cases 1 through 3 which pass, but 
> the last case, `4`, hasn't been implemented yet when running as regular user 
> and directing the container runtime to map the container user to that user as 
> well. Thus, the test fails. For now I propose to disable the 4th test case. 
> It can get re-enabled once the product code got updated to account for this 
> case (tracked in https://bugs.openjdk.org/browse/JDK-8341349).
> 
> Thoughts? Could somebody please run this through Oracle's test system in 
> order to see if this fixes the issue? Thank you!

Severin Gehwolf has updated the pull request incrementally with two additional 
commits since the last revision:

 - Revert "Improve runtime of test"
   
   This reverts commit 5b2f646c73b747f6fff364347031074d24e49822.
 - Revert "Remove the attachee container if it exists"
   
   This reverts commit ef7abf249268c30f726bee19dde3337d92c6493d.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/21289/files
  - new: https://git.openjdk.org/jdk/pull/21289/files/ef7abf24..24a4303f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=21289&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=21289&range=01-02

  Stats: 13 lines in 1 file changed: 3 ins; 10 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/21289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21289/head:pull/21289

PR: https://git.openjdk.org/jdk/pull/21289


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114 [v2]

2024-10-02 Thread Severin Gehwolf
On Wed, 2 Oct 2024 17:50:44 GMT, Sebastian Lövdahl  wrote:

> Right, this turned into a rabbit hole of its own.. 😁

Thanks. I'll remove this part from the patch as it's orthogonal to the actual 
issue and would just like to get the test passing again.

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2389419514


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114 [v2]

2024-10-02 Thread Severin Gehwolf
> The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) also 
> increased test coverage. In particular, the `TestJcmdWithSideCar.java` test 
> got enhanced to cover these cases (prior to 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
> tested):
> 
> 1. Shared volumes between attachee and attacher and shared pid namespace
> 2. Shared volumes between attachee and attacher and shared pid namespace, 
> both running with elevated privileges
> 3. Shared pid namespace between attachee and attacher only
> 4. Shared pid namespace between attachee and attacher, both running with 
> elevated privileges
> 
> The OpenJDK attach code is able to handle cases 1 through 3 which pass, but 
> the last case, `4`, hasn't been implemented yet when running as regular user 
> and directing the container runtime to map the container user to that user as 
> well. Thus, the test fails. For now I propose to disable the 4th test case. 
> It can get re-enabled once the product code got updated to account for this 
> case (tracked in https://bugs.openjdk.org/browse/JDK-8341349).
> 
> While at it, I've discovered that the `EventGeneratorLoop` running container 
> always runs for a fixed time: `30` seconds. That's irrespective of the 
> attacher containers being done. Therefore, two iterations of the loop 
> spawning this container running the fixed set of time will at least run `60` 
> seconds. In my test runs using `-summary:time` it was `70` seconds:
> 
> 
> Passed: containers/docker/TestJcmdWithSideCar.java
>   build: 2.08 seconds
>   compile: 2.068 seconds
>   build: 4.842 seconds
>   compile: 4.841 seconds
>   driver: 70.776 seconds
> Test results: passed: 1
> 
> 
> I don't think this is needed. We can destroy the process running 
> `EventGeneratorLoop` and then wait for it to exit rather than sitting there 
> and waiting until it is finished (even though we no longer do anything with 
> it). This improvement has been implemented in 
> https://github.com/openjdk/jdk/commit/5b2f646c73b747f6fff364347031074d24e49822
>  (separate commit). After this, the total runtime of the test reduces to 
> about `22` seconds:
> 
> 
> Passed: containers/docker/TestJcmdWithSideCar.java
>   build: 2.169 seconds
>   compile: 2.157 seconds
>   build: 4.964 seconds
>   compile: 4.963 seconds
>   driver: 22.928 seconds
> Test results: passed: 1
> 
> 
> The actual test skip of the 4th test case is: 
> https://github.com/openjdk/jdk/commit/95a7cc05f00e94190af583b9f9dbc659c7be879f
> 
> Thoughts? Could somebody please run this through Oracle's test system in 
> order to see if this fixes the issue?...

Severin Gehwolf has updated the pull request incrementally with one additional 
commit since the last revision:

  Remove the attachee container if it exists

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/21289/files
  - new: https://git.openjdk.org/jdk/pull/21289/files/5b2f646c..ef7abf24

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=21289&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=21289&range=00-01

  Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/21289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21289/head:pull/21289

PR: https://git.openjdk.org/jdk/pull/21289


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114

2024-10-02 Thread Severin Gehwolf
On Wed, 2 Oct 2024 12:05:36 GMT, Severin Gehwolf  wrote:

> We'll have to do `docker rm -i /test-container-main` before we start a new 
> one. I'll fix that later today.

Updated in 
https://github.com/openjdk/jdk/pull/21289/commits/ef7abf249268c30f726bee19dde3337d92c6493d
 If that doesn't work, I'll remove the test run time optimization.

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2389138766


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114

2024-10-02 Thread Severin Gehwolf
On Wed, 2 Oct 2024 10:23:05 GMT, Sebastian Lövdahl  wrote:

> However, when I'm running the test with `docker` with the changes from this 
> PR (`make test 
> TEST="jtreg:test/hotspot/jtreg/containers/docker/TestJcmdWithSideCar.java" 
> JTREG="JAVA_OPTIONS=-Djdk.test.container.command=docker"`) I get a new 
> failure: 
> [docker-failure.log](https://github.com/user-attachments/files/17227971/docker-failure.log).
>  If I revert the test runtime optimization (and keep the disabling part) the 
> test works with `docker` again.

Thanks. Fails with:


[main-container-process] docker: Error response from daemon: Conflict. The 
container name "/test-container-main" is already in use by container 
"258e8ebbbac586ffcc6a7703bcb977bd7b4c203f4badb2ab08e1651ee546c1d6". You have to 
remove (or rename) that container to be able to reuse that name.
[main-container-process] See 'docker run --help'.
java.lang.RuntimeException: Timed out while waiting for main() to start
at 
TestJcmdWithSideCar$MainContainer.waitForMainMethodStart(TestJcmdWithSideCar.java:285)
at TestJcmdWithSideCar.main(TestJcmdWithSideCar.java:110)
at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:573)
at 
com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:333)
at java.base/java.lang.Thread.run(Thread.java:1576)


We'll have to do `docker rm -i /test-container-main` before we start a new one. 
I'll fix that later today.

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2388475781


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114

2024-10-01 Thread Severin Gehwolf
On Tue, 1 Oct 2024 16:16:36 GMT, Severin Gehwolf  wrote:

> The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) also 
> increased test coverage. In particular, the `TestJcmdWithSideCar.java` test 
> got enhanced to cover these cases (prior to 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
> tested):
> 
> 1. Shared volumes between attachee and attacher and shared pid namespace
> 2. Shared volumes between attachee and attacher and shared pid namespace, 
> both running with elevated privileges
> 3. Shared pid namespace between attachee and attacher only
> 4. Shared pid namespace between attachee and attacher, both running with 
> elevated privileges
> 
> The OpenJDK attach code is able to handle cases 1 through 3 which pass, but 
> the last case, `4`, hasn't been implemented yet when running as regular user 
> and directing the container runtime to map the container user to that user as 
> well. Thus, the test fails. For now I propose to disable the 4th test case. 
> It can get re-enabled once the product code got updated to account for this 
> case (tracked in https://bugs.openjdk.org/browse/JDK-8341349).
> 
> While at it, I've discovered that the `EventGeneratorLoop` running container 
> always runs for a fixed time: `30` seconds. That's irrespective of the 
> attacher containers being done. Therefore, two iterations of the loop 
> spawning this container running the fixed set of time will at least run `60` 
> seconds. In my test runs using `-summary:time` it was `70` seconds:
> 
> 
> Passed: containers/docker/TestJcmdWithSideCar.java
>   build: 2.08 seconds
>   compile: 2.068 seconds
>   build: 4.842 seconds
>   compile: 4.841 seconds
>   driver: 70.776 seconds
> Test results: passed: 1
> 
> 
> I don't think this is needed. We can destroy the process running 
> `EventGeneratorLoop` and then wait for it to exit rather than sitting there 
> and waiting until it is finished (even though we no longer do anything with 
> it). This improvement has been implemented in 
> https://github.com/openjdk/jdk/commit/5b2f646c73b747f6fff364347031074d24e49822
>  (separate commit). After this, the total runtime of the test reduces to 
> about `22` seconds:
> 
> 
> Passed: containers/docker/TestJcmdWithSideCar.java
>   build: 2.169 seconds
>   compile: 2.157 seconds
>   build: 4.964 seconds
>   compile: 4.963 seconds
>   driver: 22.928 seconds
> Test results: passed: 1
> 
> 
> The actual test skip of the 4th test case is: 
> https://github.com/openjdk/jdk/commit/95a7cc05f00e94190af583b9f9dbc659c7be879f
> 
> Thoughts? Could somebody please run this through Oracle's test system in 
> order to see if this fixes the issue?...

Thanks, Kevin.

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2386669079


Re: RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114

2024-10-01 Thread Severin Gehwolf
On Tue, 1 Oct 2024 16:16:36 GMT, Severin Gehwolf  wrote:

> The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) also 
> increased test coverage. In particular, the `TestJcmdWithSideCar.java` test 
> got enhanced to cover these cases (prior to 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
> tested):
> 
> 1. Shared volumes between attachee and attacher and shared pid namespace
> 2. Shared volumes between attachee and attacher and shared pid namespace, 
> both running with elevated privileges
> 3. Shared pid namespace between attachee and attacher only
> 4. Shared pid namespace between attachee and attacher, both running with 
> elevated privileges
> 
> The OpenJDK attach code is able to handle cases 1 through 3 which pass, but 
> the last case, `4`, hasn't been implemented yet when running as regular user 
> and directing the container runtime to map the container user to that user as 
> well. Thus, the test fails. For now I propose to disable the 4th test case. 
> It can get re-enabled once the product code got updated to account for this 
> case (tracked in https://bugs.openjdk.org/browse/JDK-8341349).
> 
> While at it, I've discovered that the `EventGeneratorLoop` running container 
> always runs for a fixed time: `30` seconds. That's irrespective of the 
> attacher containers being done. Therefore, two iterations of the loop 
> spawning this container running the fixed set of time will at least run `60` 
> seconds. In my test runs using `-summary:time` it was `70` seconds:
> 
> 
> Passed: containers/docker/TestJcmdWithSideCar.java
>   build: 2.08 seconds
>   compile: 2.068 seconds
>   build: 4.842 seconds
>   compile: 4.841 seconds
>   driver: 70.776 seconds
> Test results: passed: 1
> 
> 
> I don't think this is needed. We can destroy the process running 
> `EventGeneratorLoop` and then wait for it to exit rather than sitting there 
> and waiting until it is finished (even though we no longer do anything with 
> it). This improvement has been implemented in 
> https://github.com/openjdk/jdk/commit/5b2f646c73b747f6fff364347031074d24e49822
>  (separate commit). After this, the total runtime of the test reduces to 
> about `22` seconds:
> 
> 
> Passed: containers/docker/TestJcmdWithSideCar.java
>   build: 2.169 seconds
>   compile: 2.157 seconds
>   build: 4.964 seconds
>   compile: 4.963 seconds
>   driver: 22.928 seconds
> Test results: passed: 1
> 
> 
> The actual test skip of the 4th test case is: 
> https://github.com/openjdk/jdk/commit/95a7cc05f00e94190af583b9f9dbc659c7be879f
> 
> Thoughts? Could somebody please run this through Oracle's test system in 
> order to see if this fixes the issue?...

PTAL @slovdahl Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/21289#issuecomment-2386454865


RFR: 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after JDK-8327114

2024-10-01 Thread Severin Gehwolf
The change of [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) also 
increased test coverage. In particular, the `TestJcmdWithSideCar.java` test got 
enhanced to cover these cases (prior to 
[JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114) only case 1 was 
tested):

1. Shared volumes between attachee and attacher and shared pid namespace
2. Shared volumes between attachee and attacher and shared pid namespace, both 
running with elevated privileges
3. Shared pid namespace between attachee and attacher only
4. Shared pid namespace between attachee and attacher, both running with 
elevated privileges

The OpenJDK attach code is able to handle cases 1 through 3 which pass, but the 
last case, `4`, hasn't been implemented yet when running as regular user and 
directing the container runtime to map the container user to that user as well. 
Thus, the test fails. For now I propose to disable the 4th test case. It can 
get re-enabled once the product code got updated to account for this case 
(tracked in https://bugs.openjdk.org/browse/JDK-8341349).

While at it, I've discovered that the `EventGeneratorLoop` running container 
always runs for a fixed time: `30` seconds. That's irrespective of the attacher 
containers being done. Therefore, two iterations of the loop spawning this 
container running the fixed set of time will at least run `60` seconds. In my 
test runs using `-summary:time` it was `70` seconds:


Passed: containers/docker/TestJcmdWithSideCar.java
  build: 2.08 seconds
  compile: 2.068 seconds
  build: 4.842 seconds
  compile: 4.841 seconds
  driver: 70.776 seconds
Test results: passed: 1


I don't think this is needed. We can destroy the process running 
`EventGeneratorLoop` and then wait for it to exit rather than sitting there and 
waiting until it is finished (even though we no longer do anything with it). 
This improvement has been implemented in 
https://github.com/openjdk/jdk/commit/5b2f646c73b747f6fff364347031074d24e49822 
(separate commit). After this, the total runtime of the test reduces to about 
`22` seconds:


Passed: containers/docker/TestJcmdWithSideCar.java
  build: 2.169 seconds
  compile: 2.157 seconds
  build: 4.964 seconds
  compile: 4.963 seconds
  driver: 22.928 seconds
Test results: passed: 1


The actual test skip of the 4th test case is: 
https://github.com/openjdk/jdk/commit/95a7cc05f00e94190af583b9f9dbc659c7be879f

Thoughts? Could somebody please run this through Oracle's test system in order 
to see if this fixes the issue? Thank you!

-

Commit messages:
 - Improve runtime of test
 - 8341310: Test containers/docker/TestJcmdWithSideCar.java fails after 
JDK-8327114

Changes: https://git.openjdk.org/jdk/pull/21289/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21289&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8341310
  Stats: 15 lines in 1 file changed: 11 ins; 4 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/21289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21289/head:pull/21289

PR: https://git.openjdk.org/jdk/pull/21289


Re: RFR: 8341246: Test com/sun/tools/attach/PermissionTest.java fails access denied after JDK-8327114 [v2]

2024-10-01 Thread Severin Gehwolf
On Tue, 1 Oct 2024 15:38:56 GMT, SendaoYan  wrote:

>> Hi all,
>> Test `com/sun/tools/attach/PermissionTest.java` fails access denied after 
>> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114). This testcase 
>> need the `readlink` permission of file `/proc/self/ns/mnt` after 
>> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114).
>> So this PR add `readlink` permission to make this test work normally.
>> Before this PR, test run failed, after this PR, test run success. Test fix 
>> only, no risk.
>
> SendaoYan has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   merge two comments enclosed with /* ... */ on two lines

Marked as reviewed by sgehwolf (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/21269#pullrequestreview-2340853189


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v7]

2024-10-01 Thread Severin Gehwolf
On Tue, 1 Oct 2024 07:33:16 GMT, David Holmes  wrote:

> We are seeing a number of test failures after this was integrated. Failing 
> tests:
> 
> * containers/docker/TestJcmdWithSideCar.java

What's the failure?

> * com/sun/tools/attach/PermissionTest.java
> 
> I will file bugs, but the permission test fails because the new code throws a 
> SecurityException when the SecurityManager is enabled:
> 
> ```
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/proc/self/ns/mnt" "readlink")
>   at 
> java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:488)
>   at 
> java.base/java.security.AccessController.checkPermission(AccessController.java:1085)
>   at 
> java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
>   at 
> java.base/sun.nio.fs.UnixFileSystemProvider.readSymbolicLink(UnixFileSystemProvider.java:554)
>   at java.base/java.nio.file.Files.readSymbolicLink(Files.java:1474)
>   at 
> jdk.attach/sun.tools.attach.VirtualMachineImpl.(VirtualMachineImpl.java:66)
> ```

This is handled here: https://git.openjdk.org/jdk/pull/21269

-

PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2385229823


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v7]

2024-10-01 Thread Severin Gehwolf
On Tue, 1 Oct 2024 09:29:20 GMT, Sebastian Lövdahl  wrote:

> Is this something that should have failed in GHA checks too?

No. Only tier1 tests run in GHA.

-

PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2385367053


Re: RFR: 8341246: Test com/sun/tools/attach/PermissionTest.java fails access denied

2024-09-30 Thread Severin Gehwolf
On Mon, 30 Sep 2024 16:11:27 GMT, SendaoYan  wrote:

> Hi all,
> Test `com/sun/tools/attach/PermissionTest.java` fails access denied after 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114). This testcase 
> need the `readlink` permission of file `/proc/self/ns/mnt` after 
> [JDK-8327114](https://bugs.openjdk.org/browse/JDK-8327114).
> So this PR add `readlink` permission to make this test work normally.
> Before this PR, test run failed, after this PR, test run success. Test fix 
> only, no risk.

test/jdk/com/sun/tools/attach/java.policy.allow line 17:

> 15: 
> 16: /* to read configuration file in META-INF/services, and write/delete 
> .attach_pid */
> 17: /* to read symbol link in /proc/self/ns/mnt */

Suggestion:

/* to read symbolic link of /proc/self/ns/mnt */


Perhaps merge the two comments enclosed with `/* ... */` on two lines?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/21269#discussion_r1781479061


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v9]

2024-09-30 Thread Severin Gehwolf
On Mon, 30 Sep 2024 12:03:14 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been added in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 34 commits:
> 
>  - Add exclusive access dirs directive for systemd tests
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Improve systemd slice test for non-root on cg v2
>  - Fix unit tests
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
>  - ... and 24 more: https://git.openjdk.org/jdk/compare/475b8943...92426dbf

Anyone? Happy to answer any questions you might have :)

Testing run on cgroups v2 with this:


Passed: containers/cgroup/CgroupSubsystemFactory.java
Passed: containers/cgroup/TestContainerized.java
Passed: containers/docker/DockerBasicTest.java
Passed: containers/docker/ShareTmpDir.java
Passed: containers/docker/TestContainerInfo.java
Passed: containers/docker/TestCPUAwareness.java
Passed: containers/docker/TestCPUSets.java
Passed: containers/docker/TestJcmd.java
Passed: containers/docker/TestJcmdWithSideCar.java
Passed: containers/docker/TestJFREvents.java
Passed: containers/docker/TestJFRNetworkEvents.java
Passed: containers/docker/TestJFRWithJMX.java
Passed: containers/docker/TestLimitsUpdating.java
Passed: containers/docker/TestMemoryAwareness.java
Passed: containers/docker/TestMemoryWithCgroupV1.java
Passed: containers/docker/TestMisc.java
Passed: containers/docker/TestPids.java
Passed: containers/systemd/SystemdMemoryAwarenessTest.java
Passed: jdk/internal/platform/cgroup/CgroupSubsystemCpuControllerTest.java
Passed: jdk/internal/platform/cgroup/CgroupSubsystemMemoryControllerTest.java
Passed: jdk/internal/platform/cgroup/CgroupV1SubsystemControllerTest.java
Passed: jdk/internal/platform/cgroup/CgroupV2SubsystemControllerTest.java
Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java
Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
Passed

Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v9]

2024-09-30 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> added in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 34 commits:

 - Add exclusive access dirs directive for systemd tests
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests
 - Add JVM_HostActiveProcessorCount using JVM api
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - ... and 24 more: https://git.openjdk.org/jdk/compare/475b8943...92426dbf

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=08
  Stats: 1595 lines in 27 files changed: 1344 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v7]

2024-09-30 Thread Severin Gehwolf
On Sun, 29 Sep 2024 06:23:34 GMT, Sebastian Lövdahl  wrote:

>> 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid 
>> (Kubernetes debug container)
>
> Sebastian Lövdahl has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains eight additional 
> commits since the last revision:
> 
>  - Merge remote-tracking branch 'upstream/master' into 
> 8327114-attach-from-container-to-container
>  - Clarify PID 1 check with comment
>  - Adapt code style
>  - Add test for the elevated privileges case
>  - Remove unused `SELF_PID_NS`
>  - Rewrite in line with suggestion from Larry Cable
>  - Reworked attach logic
>  - 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid 
> (Kubernetes debug container)

Marked as reviewed by sgehwolf (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/19055#pullrequestreview-2336685067


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v6]

2024-09-27 Thread Severin Gehwolf
On Tue, 2 Jul 2024 11:07:56 GMT, Sebastian Lövdahl  wrote:

>> 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid 
>> (Kubernetes debug container)
>
> Sebastian Lövdahl has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Clarify PID 1 check with comment

Nice work!

-

Marked as reviewed by sgehwolf (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19055#pullrequestreview-2334232805


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v6]

2024-09-27 Thread Severin Gehwolf
On Wed, 28 Aug 2024 05:05:58 GMT, Sebastian Lövdahl  wrote:

>> Sebastian Lövdahl has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Clarify PID 1 check with comment
>
> Still waiting for another reviewer.

@slovdahl Could you please merge in latest master into your branch and push so 
that the GHA checks are more up-to-date. It looks like the last update was in 
July. Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2379526579


Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) [v6]

2024-09-26 Thread Severin Gehwolf
On Tue, 2 Jul 2024 11:07:56 GMT, Sebastian Lövdahl  wrote:

>> 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid 
>> (Kubernetes debug container)
>
> Sebastian Lövdahl has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Clarify PID 1 check with comment

I'll try to review it today or tomorrow.

-

PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2376291791


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v8]

2024-09-12 Thread Severin Gehwolf
On Wed, 11 Sep 2024 17:52:44 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been added in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 32 commits:
> 
>  - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
>  - Improve systemd slice test for non-root on cg v2
>  - Fix unit tests
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
>  - Adjust test and fix code to return lowest hierarchical limit
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - ... and 22 more: https://git.openjdk.org/jdk/compare/d9fdf69c...0f559199

Anyone willing to review this?

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2346655808


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v8]

2024-09-11 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 32 commits:

 - Merge branch 'master' into jdk-8336881-metrics-systemd-slice
 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests
 - Add JVM_HostActiveProcessorCount using JVM api
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - Adjust test and fix code to return lowest hierarchical limit
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - ... and 22 more: https://git.openjdk.org/jdk/compare/d9fdf69c...0f559199

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=07
  Stats: 1595 lines in 26 files changed: 1344 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v7]

2024-09-10 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request incrementally with two additional 
commits since the last revision:

 - Improve systemd slice test for non-root on cg v2
 - Fix unit tests

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20280/files
  - new: https://git.openjdk.org/jdk/pull/20280/files/5b210068..b60cd0c5

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=05-06

  Stats: 96 lines in 3 files changed: 74 ins; 0 del; 22 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v6]

2024-09-10 Thread Severin Gehwolf
On Thu, 5 Sep 2024 13:25:17 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been proposed in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains eight additional 
> commits since the last revision:
> 
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Adjust test and fix code to return lowest hierarchical limit
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Add root check for SystemdMemoryAwarenessTest
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - 8336881: [Linux] Support for hierarchical limits for Metrics

Thanks. I forgot to update said tests after the latest update. Fixing it now.

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2340949763


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v6]

2024-09-05 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains eight additional 
commits since the last revision:

 - Add JVM_HostActiveProcessorCount using JVM api
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Adjust test and fix code to return lowest hierarchical limit
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Add root check for SystemdMemoryAwarenessTest
 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - 8336881: [Linux] Support for hierarchical limits for Metrics

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20280/files
  - new: https://git.openjdk.org/jdk/pull/20280/files/7beccf23..5b210068

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=04-05

  Stats: 2599 lines in 121 files changed: 2004 ins; 165 del; 430 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v5]

2024-09-03 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains 89 additional commits 
since the last revision:

 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init
 - 8339419: [s390x] Problemlist compiler/c2/irTests/TestIfMinMax.java
   
   Reviewed-by: thartmann
 - 8338916: Build warnings about overriding recipe for jvm-ldflags.txt
   
   Reviewed-by: jwaters, erikj
 - 8339336: Fix build system whitespace to adhere to coding conventions
   
   Reviewed-by: erikj
 - 8339214: Remove misleading CodeBuilder.loadConstant(Opcode, ConstantDesc)
   
   Reviewed-by: asotona
 - 8338740: java/net/httpclient/HttpsTunnelAuthTest.java fails with 
java.io.IOException: HTTP/1.1 header parser received no bytes
   
   Reviewed-by: djelinski, jpai
 - 8325397: sun/java2d/Disposer/TestDisposerRace.java fails in linux-aarch64
   
   Reviewed-by: alanb
 - 8339401: Optimize ClassFile load and store instructions
   
   Reviewed-by: liach, redestad
 - 8339364: AIX build fails: various unused variable and function warnings
   
   Reviewed-by: mdoerr, clanger, jwaters
 - ... and 79 more: https://git.openjdk.org/jdk/compare/e4f2f439...7beccf23

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20280/files
  - new: https://git.openjdk.org/jdk/pull/20280/files/2d918ca4..7beccf23

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=03-04

  Stats: 10214 lines in 362 files changed: 6051 ins; 1713 del; 2450 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v4]

2024-08-28 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains two additional 
commits since the last revision:

 - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
jdk-8336881-metrics-systemd-slice
 - 8336881: [Linux] Support for hierarchical limits for Metrics

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20280/files
  - new: https://git.openjdk.org/jdk/pull/20280/files/f0c775b2..2d918ca4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=02-03

  Stats: 6199 lines in 321 files changed: 4177 ins; 896 del; 1126 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v2]

2024-08-21 Thread Severin Gehwolf
On Mon, 29 Jul 2024 14:18:24 GMT, Severin Gehwolf  wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been proposed in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains three commits:
> 
>  - 8336881: [Linux] Support for hierarchical limits for Metrics
>  - Unify 4 copies of adjust_controller()
>  - Fix a needless whitespace change

Moving to draft. Needs the hotspot fix which gets a reboot.

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2296180545


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v3]

2024-08-21 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains one commit:

  8336881: [Linux] Support for hierarchical limits for Metrics

-

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=02
  Stats: 1511 lines in 24 files changed: 1260 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v2]

2024-07-29 Thread Severin Gehwolf
> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [x] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Severin Gehwolf has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20280/files
  - new: https://git.openjdk.org/jdk/pull/20280/files/179791a1..179791a1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=00-01

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


RFR: 8336881: [Linux] Support for hierarchical limits for Metrics

2024-07-22 Thread Severin Gehwolf
Please review this fix for cgroups-based metrics reporting in the 
`jdk.internal.platform` package. This fix is supposed to address wrong 
reporting of certain limits if the limits aren't set at the leaf nodes.

For example, on cg v2, the memory limit interface file is `memory.max`. 
Consider a cgroup path of  `/a/b/c/d`. The current code only reports the limits 
(via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, some 
systems - like a systemd slice - sets those limits further up the hierarchy. 
For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be set to the 
value `max` (for unlimited), yet `/a/b/c/memory.max` would report the actual 
limit value (e.g. `1048576000`).

This patch addresses this issue by:

1. Refactoring the interface lookup code to relevant controllers for 
cpu/memory. The CgroupSubsystem classes then delegate to those for the lookup. 
This facilitates having an API for the lookup of an updated limit in step 2.
2. Walking the full hierarchy of the cgroup path (if any), looking for a lower 
limit than at the leaf. Note that it's not possible to raise the limit set at a 
path closer to the root via the interface file at a further-to-the-leaf-level. 
The odd case out seems to be `max` values on some systems (which seems to be 
the default value).

As an optimization this hierarchy walk is skipped on containerized systems 
(like K8S), where the limits are set in interface files at the leaf nodes of 
the hierarchy. Therefore there should be no change on those systems.

This patch depends on the Hotspot change implementing the same for the JVM so 
that `Metrics.isContainerized()` works correctly on affected systems where 
`-XshowSettings:system` currently reports `System not containerized` due to the 
missing JVM fix. A test framework for such hierarchical systems has been 
proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
patch adds a test using that framework among some simpler unit tests.

Thoughts?

Testing:

- [ ] GHA
- [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
- [x] Some manual testing using systemd slices

-

Depends on: https://git.openjdk.org/jdk/pull/17198

Commit messages:
 - 8336881: [Linux] Support for hierarchical limits for Metrics

Changes: https://git.openjdk.org/jdk/pull/20280/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20280&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8336881
  Stats: 1511 lines in 24 files changed: 1260 ins; 152 del; 99 mod
  Patch: https://git.openjdk.org/jdk/pull/20280.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20280/head:pull/20280

PR: https://git.openjdk.org/jdk/pull/20280


Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics

2024-07-22 Thread Severin Gehwolf
On Mon, 22 Jul 2024 16:56:00 GMT, Severin Gehwolf  wrote:

> Please review this fix for cgroups-based metrics reporting in the 
> `jdk.internal.platform` package. This fix is supposed to address wrong 
> reporting of certain limits if the limits aren't set at the leaf nodes.
> 
> For example, on cg v2, the memory limit interface file is `memory.max`. 
> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. However, 
> some systems - like a systemd slice - sets those limits further up the 
> hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` might be 
> set to the value `max` (for unlimited), yet `/a/b/c/memory.max` would report 
> the actual limit value (e.g. `1048576000`).
> 
> This patch addresses this issue by:
> 
> 1. Refactoring the interface lookup code to relevant controllers for 
> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
> lookup. This facilitates having an API for the lookup of an updated limit in 
> step 2.
> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
> lower limit than at the leaf. Note that it's not possible to raise the limit 
> set at a path closer to the root via the interface file at a 
> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
> systems (which seems to be the default value).
> 
> As an optimization this hierarchy walk is skipped on containerized systems 
> (like K8S), where the limits are set in interface files at the leaf nodes of 
> the hierarchy. Therefore there should be no change on those systems.
> 
> This patch depends on the Hotspot change implementing the same for the JVM so 
> that `Metrics.isContainerized()` works correctly on affected systems where 
> `-XshowSettings:system` currently reports `System not containerized` due to 
> the missing JVM fix. A test framework for such hierarchical systems has been 
> proposed in [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This 
> patch adds a test using that framework among some simpler unit tests.
> 
> Thoughts?
> 
> Testing:
> 
> - [ ] GHA
> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
> - [x] Some manual testing using systemd slices

Note for reviewers. I'll be away until early August, so my responses might be 
delayed.

-

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2243438170


Re: RFR: 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

2024-07-09 Thread Severin Gehwolf
On Mon, 8 Jul 2024 14:19:21 GMT, Severin Gehwolf  wrote:

> Please review this simple test fix to exclude the test from being run on an 
> Alpine Linux system. Apparently default Alpine Linux systems don't have 
> cgroups set up by default the way other distros do. More info on the bug. I 
> propose to not run the test on musl systems.

Thanks for the reviews!

The docs build failure in GHA is infra related:


Run # If runner architecture is x64 set JAVA_HOME_17_X64 otherwise set to 
JAVA_HOME_17_arm64
[build.sh][INFO] CYGWIN_OR_MSYS=0
[build.sh][INFO] JAVA_HOME: /usr/lib/jvm/temurin-17-jdk-amd64
[build.sh][INFO] Downloading 
https://archive.apache.org/dist/ant/binaries/apache-ant-1.10.8-bin.zip to 
/home/runner/work/jdk/jdk/jtreg/src/make/../build/deps/apache-ant-1.10.8-bin.zip
Error: sh][ERROR] wget exited with exit code 4
Error: Process completed with exit code 1.

-

PR Comment: https://git.openjdk.org/jdk/pull/20076#issuecomment-2217257118


Integrated: 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

2024-07-09 Thread Severin Gehwolf
On Mon, 8 Jul 2024 14:19:21 GMT, Severin Gehwolf  wrote:

> Please review this simple test fix to exclude the test from being run on an 
> Alpine Linux system. Apparently default Alpine Linux systems don't have 
> cgroups set up by default the way other distros do. More info on the bug. I 
> propose to not run the test on musl systems.

This pull request has now been integrated.

Changeset: f3ff4f74
Author:    Severin Gehwolf 
URL:   
https://git.openjdk.org/jdk/commit/f3ff4f7427c3c3f5cb2a115a61462bb9d28de1cd
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod

8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

Reviewed-by: stuefe, mbaesken

-

PR: https://git.openjdk.org/jdk/pull/20076


Re: RFR: 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

2024-07-08 Thread Severin Gehwolf
On Mon, 8 Jul 2024 14:26:29 GMT, Matthias Baesken  wrote:

> Hi Severin, sure ! I put it into our build/test setup .

Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/20076#issuecomment-2214368557


Re: RFR: 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

2024-07-08 Thread Severin Gehwolf
On Mon, 8 Jul 2024 14:19:21 GMT, Severin Gehwolf  wrote:

> Please review this simple test fix to exclude the test from being run on an 
> Alpine Linux system. Apparently default Alpine Linux systems don't have 
> cgroups set up by default the way other distros do. More info on the bug. I 
> propose to not run the test on musl systems.

@MBaesken Since you've noticed the failing test in your CI could you please 
verify the issue is gone with this? I don't have an Alpine Linux setup to 
easily test this exclusion. Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/20076#issuecomment-2214223234


RFR: 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

2024-07-08 Thread Severin Gehwolf
Please review this simple test fix to exclude the test from being run on an 
Alpine Linux system. Apparently default Alpine Linux systems don't have cgroups 
set up by default the way other distros do. More info on the bug. I propose to 
not run the test on musl systems.

-

Commit messages:
 - 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux

Changes: https://git.openjdk.org/jdk/pull/20076/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20076&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335882
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/20076.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20076/head:pull/20076

PR: https://git.openjdk.org/jdk/pull/20076


Integrated: 8335775: Remove extraneous 's' in comment of rawmonitor.cpp test file

2024-07-05 Thread Severin Gehwolf
On Fri, 5 Jul 2024 11:14:10 GMT, Severin Gehwolf  wrote:

> Trivial comment only change in a test. Please review!
> 
> Thanks!

This pull request has now been integrated.

Changeset: ff49f677
Author:    Severin Gehwolf 
URL:   
https://git.openjdk.org/jdk/commit/ff49f677ee5017019c90823bc412ceb90068ffbd
Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod

8335775: Remove extraneous 's' in comment of rawmonitor.cpp test file

Reviewed-by: gdams, stuefe

-

PR: https://git.openjdk.org/jdk/pull/20051


Re: RFR: 8335775: Remove extraneous 's' in comment of rawmonitor.cpp test file

2024-07-05 Thread Severin Gehwolf
On Fri, 5 Jul 2024 11:14:10 GMT, Severin Gehwolf  wrote:

> Trivial comment only change in a test. Please review!
> 
> Thanks!

Thanks for the review!

-

PR Comment: https://git.openjdk.org/jdk/pull/20051#issuecomment-2210765310


Re: RFR: 8335643: serviceability/dcmd/vm tests fail for ZGC after JDK-8322475 [v3]

2024-07-05 Thread Severin Gehwolf
On Fri, 5 Jul 2024 10:56:37 GMT, Thomas Stuefe  wrote:

>> The new System.map facility fails its tests when the JVM is using ZGC. The 
>> facility is working fine, but the test checks for the java heap to appear as 
>> committed non-shared memory segment, but on ZGC we reserve the memory as 
>> shared memory.
>
> Thomas Stuefe has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   problemlist

Marked as reviewed by sgehwolf (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/20034#pullrequestreview-2160590870


RFR: 8335775: Remove extraneous 's' in comment of rawmonitor.cpp test file

2024-07-05 Thread Severin Gehwolf
Trivial comment only change in a test. Please review!

Thanks!

-

Commit messages:
 - 8335775: Remove extraneous 's' in comment of rawmonitor.cpp test file

Changes: https://git.openjdk.org/jdk/pull/20051/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20051&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335775
  Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/20051.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20051/head:pull/20051

PR: https://git.openjdk.org/jdk/pull/20051


Re: RFR: 8335643: serviceability/dcmd/vm tests fail for ZGC after JDK-8322475 [v2]

2024-07-05 Thread Severin Gehwolf
On Fri, 5 Jul 2024 08:53:30 GMT, Thomas Stuefe  wrote:

>> The new System.map facility fails its tests when the JVM is using ZGC. The 
>> facility is working fine, but the test checks for the java heap to appear as 
>> committed non-shared memory segment, but on ZGC we reserve the memory as 
>> shared memory.
>
> Thomas Stuefe has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   problemlist

Seems fine to me.

Apparently also in `ProblemList-generational-zgc.txt`, which needs to remove 
those lines.

-

Marked as reviewed by sgehwolf (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20034#pullrequestreview-2160393678
Changes requested by sgehwolf (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20034#pullrequestreview-2160395464


Re: RFR: 8335643: serviceability/dcmd/vm tests fail for ZGC after JDK-8322475

2024-07-04 Thread Severin Gehwolf
On Thu, 4 Jul 2024 12:57:09 GMT, Thomas Stuefe  wrote:

> The new System.map facility fails its tests when the JVM is using ZGC. The 
> facility is working fine, but the test checks for the java heap to appear as 
> committed non-shared memory segment, but on ZGC we reserve the memory as 
> shared memory.

@tstuefe Please also remove the problem list entry. See the error from the 
skara bot. Thanks!

-

PR Review: https://git.openjdk.org/jdk/pull/20034#pullrequestreview-2160006454


Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v8]

2024-07-01 Thread Severin Gehwolf
On Fri, 28 Jun 2024 15:41:48 GMT, Severin Gehwolf  wrote:

>> Please review this enhancement to the container detection code which allows 
>> it to figure out whether the JVM is actually running inside a container 
>> (`podman`, `docker`, `crio`), or with some other means that enforces 
>> memory/cpu limits by means of the cgroup filesystem. If neither of those 
>> conditions hold, the JVM runs in not containerized mode, addressing the 
>> issue described in the JBS tracker. For example, on my Linux system 
>> `is_containerized() == false" is being indicated with the following trace 
>> log line:
>> 
>> 
>> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false 
>> because no cpu or memory limit is present
>> 
>> 
>> This state is being exposed by the Java `Metrics` API class using the new 
>> (still JDK internal) `isContainerized()` method. Example:
>> 
>> 
>> java -XshowSettings:system --version
>> Operating System Metrics:
>> Provider: cgroupv1
>> System not containerized.
>> openjdk 23-internal 2024-09-17
>> OpenJDK Runtime Environment (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing)
>> 
>> 
>> The basic property this is being built on is the observation that the cgroup 
>> controllers typically get mounted read only into containers. Note that the 
>> current container tests assert that `OSContainer::is_containerized() == 
>> true` in various tests. Therefore, using the heuristic of "is any memory or 
>> cpu limit present" isn't sufficient. I had considered that in an earlier 
>> iteration, but many container tests failed.
>> 
>> Overall, I think, with this patch we improve the current situation of 
>> claiming a containerized system being present when it's actually just a 
>> regular Linux system.
>> 
>> Testing:
>> 
>> - [x] GHA (risc-v failure seems infra related)
>> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 
>> (including gtests)
>> - [x] Some manual testing using cri-o
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Refactor mount info matching to helper function
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Remove problem listing of PlainRead which is reworked here
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Add doc for mountinfo scanning.
>  - Unify naming of variables
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/486aa11e...1017da35

Thank you for the review!

-

PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2199581201


Integrated: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container

2024-07-01 Thread Severin Gehwolf
On Mon, 11 Mar 2024 16:55:36 GMT, Severin Gehwolf  wrote:

> Please review this enhancement to the container detection code which allows 
> it to figure out whether the JVM is actually running inside a container 
> (`podman`, `docker`, `crio`), or with some other means that enforces 
> memory/cpu limits by means of the cgroup filesystem. If neither of those 
> conditions hold, the JVM runs in not containerized mode, addressing the issue 
> described in the JBS tracker. For example, on my Linux system 
> `is_containerized() == false" is being indicated with the following trace log 
> line:
> 
> 
> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false 
> because no cpu or memory limit is present
> 
> 
> This state is being exposed by the Java `Metrics` API class using the new 
> (still JDK internal) `isContainerized()` method. Example:
> 
> 
> java -XshowSettings:system --version
> Operating System Metrics:
> Provider: cgroupv1
> System not containerized.
> openjdk 23-internal 2024-09-17
> OpenJDK Runtime Environment (fastdebug build 
> 23-internal-adhoc.sgehwolf.jdk-jdk)
> OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, 
> mixed mode, sharing)
> 
> 
> The basic property this is being built on is the observation that the cgroup 
> controllers typically get mounted read only into containers. Note that the 
> current container tests assert that `OSContainer::is_containerized() == true` 
> in various tests. Therefore, using the heuristic of "is any memory or cpu 
> limit present" isn't sufficient. I had considered that in an earlier 
> iteration, but many container tests failed.
> 
> Overall, I think, with this patch we improve the current situation of 
> claiming a containerized system being present when it's actually just a 
> regular Linux system.
> 
> Testing:
> 
> - [x] GHA (risc-v failure seems infra related)
> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including 
> gtests)
> - [x] Some manual testing using cri-o
> 
> Thoughts?

This pull request has now been integrated.

Changeset: 0a6ffa57
Author:Severin Gehwolf 
URL:   
https://git.openjdk.org/jdk/commit/0a6ffa57954ddf4f92205205a5a1bada813d127a
Stats: 411 lines in 20 files changed: 305 ins; 79 del; 27 mod

8261242: [Linux] OSContainer::is_containerized() returns true when run outside 
a container

Reviewed-by: stuefe, iklam

-

PR: https://git.openjdk.org/jdk/pull/18201


Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v8]

2024-06-28 Thread Severin Gehwolf
On Fri, 28 Jun 2024 15:41:48 GMT, Severin Gehwolf  wrote:

>> Please review this enhancement to the container detection code which allows 
>> it to figure out whether the JVM is actually running inside a container 
>> (`podman`, `docker`, `crio`), or with some other means that enforces 
>> memory/cpu limits by means of the cgroup filesystem. If neither of those 
>> conditions hold, the JVM runs in not containerized mode, addressing the 
>> issue described in the JBS tracker. For example, on my Linux system 
>> `is_containerized() == false" is being indicated with the following trace 
>> log line:
>> 
>> 
>> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false 
>> because no cpu or memory limit is present
>> 
>> 
>> This state is being exposed by the Java `Metrics` API class using the new 
>> (still JDK internal) `isContainerized()` method. Example:
>> 
>> 
>> java -XshowSettings:system --version
>> Operating System Metrics:
>> Provider: cgroupv1
>> System not containerized.
>> openjdk 23-internal 2024-09-17
>> OpenJDK Runtime Environment (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 
>> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing)
>> 
>> 
>> The basic property this is being built on is the observation that the cgroup 
>> controllers typically get mounted read only into containers. Note that the 
>> current container tests assert that `OSContainer::is_containerized() == 
>> true` in various tests. Therefore, using the heuristic of "is any memory or 
>> cpu limit present" isn't sufficient. I had considered that in an earlier 
>> iteration, but many container tests failed.
>> 
>> Overall, I think, with this patch we improve the current situation of 
>> claiming a containerized system being present when it's actually just a 
>> regular Linux system.
>> 
>> Testing:
>> 
>> - [x] GHA (risc-v failure seems infra related)
>> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 
>> (including gtests)
>> - [x] Some manual testing using cri-o
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Refactor mount info matching to helper function
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Remove problem listing of PlainRead which is reworked here
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Add doc for mountinfo scanning.
>  - Unify naming of variables
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/486aa11e...1017da35

@adinn @iklam Could one of you please help with a second review, please? Not 
sure if @larry-cable review gets recorded with him having no OpenJDK project 
role :-/ Thanks in advance!

-

PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2197212014


Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v7]

2024-06-28 Thread Severin Gehwolf
On Thu, 27 Jun 2024 18:40:09 GMT, Larry Cable  wrote:

>> Severin Gehwolf has updated the pull request with a new target base due to a 
>> merge or a rebase. The pull request now contains 17 commits:
>> 
>>  - Refactor mount info matching to helper function
>>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>>  - Remove problem listing of PlainRead which is reworked here
>>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>>  - Add doc for mountinfo scanning.
>>  - Unify naming of variables
>>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>>  - jcheck fixes
>>  - ... and 7 more: https://git.openjdk.org/jdk/compare/baafa662...532ea33b
>
> src/hotspot/share/prims/jvm.cpp line 504:
> 
>> 502: JVM_LEAF(jboolean, JVM_IsContainerized(void))
>> 503: #ifdef LINUX
>> 504:   if (OSContainer::is_containerized()) {
> 
> // nit: personal preference...
> 
> return OSContainer::isContainerized() ? JNI_TRUE : JNI_FALSE;

I've kept this as is, since the suggestion generates this code after 
preprocessing on Linux:


return OSContainer::is_containerized() ? JNI_TRUE : JNI_FALSE;
return JNI_FALSE;


over the existing version:


if (OSContainer::is_containerized()) {
return JNI_TRUE;
}
return JNI_FALSE;

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1658938198


  1   2   >