Here are a few comments from scanning the webrev.

It looks like you can remove line 151

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java.sdiff.html

151         int[] ints = new int[0];

—————

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/share/classes/jdk/internal/platform/Metrics.java.sdiff.html

There are several comments stating that -2 == unlimited.  This is not the case.

@return count of elapsed periods or -2 if the quota is unlimited.

—————

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html
 
<http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html>

Shouldn’t you just check for cgroupv2 before trying to run the 
testKernelMemoryLimit and testOomKillFlag tests?

—————

There are a few places where getPerCpuUsage is returning “new long[0]” but you 
changed the interface to expect null
for not supported.

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
 
<http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html>

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html
 
<http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html>

You probably need to check all users of the APIs which used to return array[0] 
which now return null to make sure you
don’t get null pointer exceptions.

One example is testCpuConsumption.

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html
 
<http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html>

Also, 
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java.html:167


You’ll also have to update copyrights to 2020.

Bob.


> On Dec 20, 2019, at 9:50 AM, Severin Gehwolf <sgehw...@redhat.com> wrote:
> 
> Hi Bob,
> 
> Sorry for the delay to get this updated.
> 
> Changes done in this latest webrev:
> 
>   1. Rebased on top of 8226575: OperatingSystemMXBean should be made
>      container aware. File read ops now use privileged code.
>   2. Warning for mixed cgroup controllers and returning null for metrics.
>   3. Added implementations for getBlkIOServiceCount() and
>      getBlkIOServiced() using sum of rios/wios and rbytes/wbytes across
>      devices. Added impl for getTcpMemoryUsage()
>   4. Returning -2 for not supported (if the metric would return long).
>      boolean metrics now return Boolean and null if not supported. Same
>      for array return types. null if not supported for symmetry.
>   5. -XshowSettings:system output has been updated to return "N/A" for
>      when a metric is not available. E.g. "Kernel OOM killer enabled"
>      Boolean.
>   6. Tests have been updated to reflect this.
> 
> webrev:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/
> incremental webrev:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/
> 
> Testing: Docker tests and podman testing on cgroupsv2. I'll run it
> through jdk/submit as well.
> 
> Hopefully this can get pushed with 8230305 early on in the jdk 15 cycle
> :)
> 
> A couple of more inline comments below.
> 
> On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote:
>> Sorry for the delay in responding.  See comments below:
>> 
>>> On Nov 29, 2019, at 4:05 AM, Severin Gehwolf <sgehw...@redhat.com> wrote:
>>> 
>>> On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote:
>>>> Hi,
>>>> 
>>>> Could I please get reviews of these core-libs, Linux-only, changes to
>>>> the Metrics subsystem? This patch implements cgroupv2 support for
>>>> Metrics which are currently cgroupv1-only. Fedora 31 switched to
>>>> cgroupv2 by default so it's time to get OpenJDK recognize it.
>>>> 
>>>> Note that a couple of metrics are no longer supported with cgroupv2.
>>>> Most notably (not an exhaustive list, though):
>>>> 
>>>> Metrics.getKernel*() family of methods.
>>>> Metrics.getTcp*() family of methods.
>>>> Metrics.getBlkIO*() family of methods.
>>>> Metrics.isMemoryOOMKillEnabled()
>>>> 
>>>> A couple of open questions with regards to that:
>>>> 
>>>> 1)
>>>> Most API docs of Metrics make no distiction between "unlimited" and
>>>> "not supported", both returning -1 for longs, for example. This is a
>>>> problem, because output of "java -XshowSettings:system -version" will
>>>> not distinguish between unlimited and not supported metrics. Would it
>>>> be acceptable to change the API to distinguish those cases so that
>>>> LauncherHelper could display them appropriately?
>>>> 
>>>> 2)
>>>> How should we deal with "not supported" for booleans/arrays, etc.?
>>>> Would it make sense to return record objects from the Metrics API so
>>>> that this could be dealt with? E.g.
>>>> 
>>>> Metrics m = ...
>>>> MetricResult<int[]> result = m.getCpuSetCpus();
>>>> switch(result.getType()) {
>>>>  case NOT_SUPPORTED: /* do something */; break;
>>>>  case SUPPORTED: int[] val = result.get(); break;
>>>>  ...         
>>>> }
>>>> 
>>>> I'm bringing this up, because the proposed patch doesn't deal with the
>>>> above open questions as of yet. With that being said, it's mostly
>>>> identical to the proposed hotspot changes in [1].
>> 
>> For issue 1 and 2 …
>> 
>> I wonder if we should change the API to throw UnsupportedOperationException 
>> for those methods 
>> that are not available.  This exception is used quite a lot in the java/nio 
>> and java/net packages
>> for dealing with functionality not available on a host platform.
>> 
>> As an alternate suggestion, we already return -2 for "not supported" for 
>> most APIs in the hotspot
>> implementation.  We could use this for non boolean values in the Metrics 
>> API.  For boolean
>> values, we could change the API to return “Boolean”.  A null return would 
>> signify not
>> implemented.
> 
> This alternative has been implemented in the latest webrev.
> LauncherHelper has been updated to print "N/A" if some property being
> printed is not supported. Example output for cgroupsv2:
> 
> $ ./bin/java -XshowSettings:system -version
> Operating System Metrics:
>    Provider: cgroupv2
>    Effective CPU Count: 4
>    CPU Period: 100000us
>    CPU Quota: -1
>    CPU Shares: -1
>    List of Processors: N/A
>    List of Effective Processors: N/A
>    List of Memory Nodes: N/A
>    List of Available Memory Nodes: N/A
>    CPUSet Memory Pressure Enabled: N/A
>    Memory Limit: Unlimited
>    Memory Soft Limit: Unlimited
>    Memory & Swap Limit: Unlimited
>    Kernel Memory Limit: N/A
>    TCP Memory Limit: N/A
>    Out Of Memory Killer Enabled: N/A
> 
> openjdk version "15-internal" 2020-09-15
> OpenJDK Runtime Environment (build 
> 15-internal+0-adhoc.sgehwolf.openjdk-head-2)
> OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, 
> mixed mode, sharing)
> 
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231111
>>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/
>>>> 
>> 
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html
>> 
>> Should we check to make sure that there are no mixed cgroupv1 and cgroupv2 
>> mounted subsystems since
>> you are not supporting mixing.
> 
> Done. null is returned for metrics and a warning printed to stderr.
> 
>> 
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
>> 
>> It looks looks like there may be alternate ways of reporting some of the 
>> metrics that you have classified as RETVAL_NOT_SUPPORTED.
>> 
>> BlkIOServicedCount (io.stat)
>> KernelMemory (memory.stat)
>> TcpMemory (memory.stat)
> 
> The latest webrev has getBlkIOService* and getTcpMemoryUsage impls.
> I've left out kernel memory metrics as it wasn't clear what this would
> be. We can add that in a later patch. The size of this patch is already
> rather big.
> 
>> You could try the same trick for getMemoryAndSwapMaxUsage which keeps track 
>> of the highest returned value.
> 
> We've decided to not do this. getMemoryAndSwapMaxUsage and
> getMemoryMaxUsage is being returned as not supported in cgroups v2.
> 
> Thanks,
> Severin
> 
>> for the benefit of other reviewers, you should provide links to the cgroupv1 
>> and v2 docs.
>> 
>> https://www.kernel.org/doc/Documentation/cgroup-v2.txt
>> 
>>>> Testing: jdk/submit and platform docker tests on Linux x86_64 (with
>>>> hybrid hierarchy, docker/podman) and on Linux x86_64 with unified
>>>> hierarchy (podman only).
>>>> 
>>>> Thoughts? Suggestions?
>> 
>> Do you think we should check the docker version being used for the tests to 
>> make sure it
>> supports cgroupv2?  I assume a fairly recent version is required to work 
>> with cgroupv2.
>> 
>> Bob.
>> 
>> 
>> 
>>> Ping?
>>> 
>>>> Thanks,
>>>> Severin
>>>> 
>>>> [1] 
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html
> 

Reply via email to