Here are a few comments from scanning the webrev.
It looks like you can remove line 151 http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java.sdiff.html 151 int[] ints = new int[0]; ————— http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/share/classes/jdk/internal/platform/Metrics.java.sdiff.html There are several comments stating that -2 == unlimited. This is not the case. @return count of elapsed periods or -2 if the quota is unlimited. ————— http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html <http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html> Shouldn’t you just check for cgroupv2 before trying to run the testKernelMemoryLimit and testOomKillFlag tests? ————— There are a few places where getPerCpuUsage is returning “new long[0]” but you changed the interface to expect null for not supported. http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html <http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html <http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html> You probably need to check all users of the APIs which used to return array[0] which now return null to make sure you don’t get null pointer exceptions. One example is testCpuConsumption. http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html <http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html> Also, http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java.html:167 You’ll also have to update copyrights to 2020. Bob. > On Dec 20, 2019, at 9:50 AM, Severin Gehwolf <sgehw...@redhat.com> wrote: > > Hi Bob, > > Sorry for the delay to get this updated. > > Changes done in this latest webrev: > > 1. Rebased on top of 8226575: OperatingSystemMXBean should be made > container aware. File read ops now use privileged code. > 2. Warning for mixed cgroup controllers and returning null for metrics. > 3. Added implementations for getBlkIOServiceCount() and > getBlkIOServiced() using sum of rios/wios and rbytes/wbytes across > devices. Added impl for getTcpMemoryUsage() > 4. Returning -2 for not supported (if the metric would return long). > boolean metrics now return Boolean and null if not supported. Same > for array return types. null if not supported for symmetry. > 5. -XshowSettings:system output has been updated to return "N/A" for > when a metric is not available. E.g. "Kernel OOM killer enabled" > Boolean. > 6. Tests have been updated to reflect this. > > webrev: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/ > incremental webrev: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/ > > Testing: Docker tests and podman testing on cgroupsv2. I'll run it > through jdk/submit as well. > > Hopefully this can get pushed with 8230305 early on in the jdk 15 cycle > :) > > A couple of more inline comments below. > > On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote: >> Sorry for the delay in responding. See comments below: >> >>> On Nov 29, 2019, at 4:05 AM, Severin Gehwolf <sgehw...@redhat.com> wrote: >>> >>> On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote: >>>> Hi, >>>> >>>> Could I please get reviews of these core-libs, Linux-only, changes to >>>> the Metrics subsystem? This patch implements cgroupv2 support for >>>> Metrics which are currently cgroupv1-only. Fedora 31 switched to >>>> cgroupv2 by default so it's time to get OpenJDK recognize it. >>>> >>>> Note that a couple of metrics are no longer supported with cgroupv2. >>>> Most notably (not an exhaustive list, though): >>>> >>>> Metrics.getKernel*() family of methods. >>>> Metrics.getTcp*() family of methods. >>>> Metrics.getBlkIO*() family of methods. >>>> Metrics.isMemoryOOMKillEnabled() >>>> >>>> A couple of open questions with regards to that: >>>> >>>> 1) >>>> Most API docs of Metrics make no distiction between "unlimited" and >>>> "not supported", both returning -1 for longs, for example. This is a >>>> problem, because output of "java -XshowSettings:system -version" will >>>> not distinguish between unlimited and not supported metrics. Would it >>>> be acceptable to change the API to distinguish those cases so that >>>> LauncherHelper could display them appropriately? >>>> >>>> 2) >>>> How should we deal with "not supported" for booleans/arrays, etc.? >>>> Would it make sense to return record objects from the Metrics API so >>>> that this could be dealt with? E.g. >>>> >>>> Metrics m = ... >>>> MetricResult<int[]> result = m.getCpuSetCpus(); >>>> switch(result.getType()) { >>>> case NOT_SUPPORTED: /* do something */; break; >>>> case SUPPORTED: int[] val = result.get(); break; >>>> ... >>>> } >>>> >>>> I'm bringing this up, because the proposed patch doesn't deal with the >>>> above open questions as of yet. With that being said, it's mostly >>>> identical to the proposed hotspot changes in [1]. >> >> For issue 1 and 2 … >> >> I wonder if we should change the API to throw UnsupportedOperationException >> for those methods >> that are not available. This exception is used quite a lot in the java/nio >> and java/net packages >> for dealing with functionality not available on a host platform. >> >> As an alternate suggestion, we already return -2 for "not supported" for >> most APIs in the hotspot >> implementation. We could use this for non boolean values in the Metrics >> API. For boolean >> values, we could change the API to return “Boolean”. A null return would >> signify not >> implemented. > > This alternative has been implemented in the latest webrev. > LauncherHelper has been updated to print "N/A" if some property being > printed is not supported. Example output for cgroupsv2: > > $ ./bin/java -XshowSettings:system -version > Operating System Metrics: > Provider: cgroupv2 > Effective CPU Count: 4 > CPU Period: 100000us > CPU Quota: -1 > CPU Shares: -1 > List of Processors: N/A > List of Effective Processors: N/A > List of Memory Nodes: N/A > List of Available Memory Nodes: N/A > CPUSet Memory Pressure Enabled: N/A > Memory Limit: Unlimited > Memory Soft Limit: Unlimited > Memory & Swap Limit: Unlimited > Kernel Memory Limit: N/A > TCP Memory Limit: N/A > Out Of Memory Killer Enabled: N/A > > openjdk version "15-internal" 2020-09-15 > OpenJDK Runtime Environment (build > 15-internal+0-adhoc.sgehwolf.openjdk-head-2) > OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, > mixed mode, sharing) > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231111 >>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/ >>>> >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html >> >> Should we check to make sure that there are no mixed cgroupv1 and cgroupv2 >> mounted subsystems since >> you are not supporting mixing. > > Done. null is returned for metrics and a warning printed to stderr. > >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html >> >> It looks looks like there may be alternate ways of reporting some of the >> metrics that you have classified as RETVAL_NOT_SUPPORTED. >> >> BlkIOServicedCount (io.stat) >> KernelMemory (memory.stat) >> TcpMemory (memory.stat) > > The latest webrev has getBlkIOService* and getTcpMemoryUsage impls. > I've left out kernel memory metrics as it wasn't clear what this would > be. We can add that in a later patch. The size of this patch is already > rather big. > >> You could try the same trick for getMemoryAndSwapMaxUsage which keeps track >> of the highest returned value. > > We've decided to not do this. getMemoryAndSwapMaxUsage and > getMemoryMaxUsage is being returned as not supported in cgroups v2. > > Thanks, > Severin > >> for the benefit of other reviewers, you should provide links to the cgroupv1 >> and v2 docs. >> >> https://www.kernel.org/doc/Documentation/cgroup-v2.txt >> >>>> Testing: jdk/submit and platform docker tests on Linux x86_64 (with >>>> hybrid hierarchy, docker/podman) and on Linux x86_64 with unified >>>> hierarchy (podman only). >>>> >>>> Thoughts? Suggestions? >> >> Do you think we should check the docker version being used for the tests to >> make sure it >> supports cgroupv2? I assume a fairly recent version is required to work >> with cgroupv2. >> >> Bob. >> >> >> >>> Ping? >>> >>>> Thanks, >>>> Severin >>>> >>>> [1] >>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html >