> On Feb 12, 2020, at 6:22 AM, Severin Gehwolf <sgehw...@redhat.com> wrote:
> 
> Hi Bob,
> 
> On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote:
>> I applied your patch to the latest JDK 15 sources and ran the container
>> tests on Oracle Linux 8.1 with podman/cgroupv2 enabled.  There were some 
>> issues.
>> I’m not sure if its my setup or not.  
> 
> Looking over them, they appear to be setup issues.
> 
> Getting the setup working right was tricky for me as well.
> 
> Aside:
> podman has a --runtime switch which you could point to a cgroups v2
> capable crun binary for example.
> https://bugs.openjdk.java.net/browse/JDK-8230305 has some examples of
> how to use it.
> 
> I also needed some extra setup to get the controllers' delegation to
> work:
> https://urldefense.com/v3/__https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/__;!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GXp0Ri0Wg$
>  
> 
>> I also ran the same build on Ubuntu with docker/cgroupv1.  I didn't see any 
>> failures on
>> the cgroupv1 system.
> 
> Great, thanks!
> 
>> Here are some notes:
>> 
>> The podman version on OL 8.1 doesn't yes support cgroupv2.  I
>> built the latest from sources for this test.
>> 
>> The docker tests take a very long time under podman!  Longer than
>> the cgroupv1 run.
> 
> Possibly. I haven't done any fair comparison of the two.

It’s possible that it’s trying to access several container hubs.  I may be 
getting some timeouts
since I haven’t figured out how to specify proxies for podman.  I’ve set a 
local http_proxy env variable
but maybe that’s not working.

> 
>> cpusets and cpusets.mems are blank on host and if none are specified
>> on podman/docker run command.  On cgroupv1 they are host values if not
>> specified for container.  Effective cpusets and cpusets.mems are set properly
>> in a container.
> 
> Yes, there seems to be slight differences to cgroup v1. You can only
> set cpusets on containers if the host has the controller enabled,
> though:
> 
> # cat /sys/fs/cgroup/cgroup.subtree_control
> cpuset cpu io memory pids
> # podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt fedora:30 
> /bin/bash
> [root@5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' 
> -f3)/cpuset.cpus
> 0-1
> [root@5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java -XshowSettings:system 
> -version
> Operating System Metrics:
>    Provider: cgroupv2
>    Effective CPU Count: 2
>    CPU Period: 100000us
>    CPU Quota: -1
>    CPU Shares: -1
>    List of Processors, 2 total: 
>    0 1 
>    List of Effective Processors, 2 total: 
>    0 1 
>    List of Memory Nodes: N/A
>    List of Available Memory Nodes, 1 total: 
>    0 
>    Memory Limit: 300.00M
>    Memory Soft Limit: Unlimited
>    Memory & Swap Limit: 600.00M
> 
> openjdk version "15-internal" 2020-09-15
> OpenJDK Runtime Environment (build 
> 15-internal+0-adhoc.sgehwolf.openjdk-head-2)
> OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, 
> mixed mode, sharing)
> 

In my setup, your command above works correctly.  The only thing that doesn’t 
work is if I run the -XshowSettings
directly on the host OR I run a container without specifying —cpuset-cpus.  The 
default setup seems wrong.

> 
>> HOST OUTPUT:
>> ./java -XshowSettings:system -version
>> Operating System Metrics:
>>    Provider: cgroupv2
>>    Effective CPU Count: 32
>>    CPU Period: -1
>>    CPU Quota: -1
>>    CPU Shares: -1
>>    List of Processors: N/A
>>    List of Effective Processors: N/A
>>    List of Memory Nodes: N/A
>>    List of Available Memory Nodes: N/A
>>    Memory Limit: Unlimited
>>    Memory Soft Limit: Unlimited
>>    Memory & Swap Limit: Unlimited
>> 
>> CONTAINER OUTPUT:
>> # podman run -it -v `pwd`:/mnt ubuntu bash
>> root@3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system -version
>> Operating System Metrics:
>>    Provider: cgroupv2
>>    Effective CPU Count: 32
>>    CPU Period: 100000us
>>    CPU Quota: -1
>>    CPU Shares: -1
>>    List of Processors: N/A
>>    List of Effective Processors, 32 total: 
>>    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 
>> 28 29 30 31 
>>    List of Memory Nodes: N/A
>>    List of Available Memory Nodes, 1 total: 
>>    0 
>>    Memory Limit: Unlimited
>>    Memory Soft Limit: Unlimited
>>    Memory & Swap Limit: Unlimited
> 
> Yes, host and container output differ depending on the configured test
> system. I remember that I had cpusets working on the host system too at
> some point, but I forgot what the magic was to get this properly
> delegated via systemd.
> 
>> Docker tests fail if /bin/docker is not available in podman setup.  We
>> probably should enhance the docker check to also look for podman.
> 
> Could you be more specific about this? How do you run docker tests? I
> use:
> 
> -e PATH -verbose:summary -Djdk.test.container.command=podman 
> -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30
> 
> with -Djdk.test.container.command=podman it shouldn't need docker. Did
> you specify that property?

I thought that most podman setups try to provide docker compatibility.  I did 
not try using the properties.
In order to make our automation work cleaner, I was hoping that we could always 
just execute docker.

> 
>> Two container tests failed:
>> 
>> FAILED: containers/cgroup/PlainRead.java failed Memory Limit is: -2 instead
>> of unlimited or -1.  This is because memory.max is not foumd.
> 
> This doesn't fail for me, because I've got memory.max present on host:
> 
> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' 
> -f3)/cgroup.controllers
> cpu io memory pids
> [root@f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' 
> -f3)/memory.max
> max
> 
> Note, howevre, this is a hotspot test. So support for it for cgroup v2
> came with JDK-8230305. It seems we should return -1 if memory.max is
> not found (over -2, not supported). Could you file a bug for this? It's
> unrelated to this change. It should be a simple fix.

Here are my controllers:
cpuset cpu io memory pids rdma

This seems like another case where this file doesn’t exist in the path we form 
on the host.

[0.035s][debug][os,container] 
/sys/fs/cgroup/user.slice/user-23603.slice/session-14.scope/cpu.max failed, No 
such file or directory

It does exist here:  /sys/fs/cgroup/user.slice so maybe it’s a delegation issue.

> 
>> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
>> This fails because nr_periods line doesn't always exist.  I think you’ve got
>> to enable a quota for this to appear (not sure).  
> 
> Passes for me, but it needs the cpu controller enabled on the test
> system.
> 
> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' 
> -f3)/cgroup.controllers
> cpu io memory pids
> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.stat
> usage_usec 1157537
> user_usec 606832
> system_usec 550705
> nr_periods 0
> nr_throttled 0
> throttled_usec 0
> 
>> Here’s the contents:
>> % more cpu.stat
>> usage_usec 23974562755
>> user_usec 22257183568
>> system_usec 1717379186
> 
> It suggests you've got the cpu controller disabled:
> https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html*cpu__;Iw!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GWZKkNZMA$
>  
> 
> """
> and the following three when the controller is enabled:
> 
> 
> nr_periods
> nr_throttled
> throttled_usec
> “""

I do have the cpu controller enabled.  I’ll look into your fixes for delegation 
issues and try again.

Bob.

> 
>> CGROUPV2 results on Oracle Linux 8.1
>> ---------
>> 
>> testing container APIs
>> 
>> FAILED: containers/cgroup/PlainRead.java
>> Passed: containers/docker/DockerBasicTest.java
>> Passed: containers/docker/TestCPUAwareness.java
>> Passed: containers/docker/TestCPUSets.java
>> Passed: containers/docker/TestJcmdWithSideCar.java
>> Passed: containers/docker/TestJFREvents.java
>> Passed: containers/docker/TestJFRNetworkEvents.java
>> Passed: containers/docker/TestMemoryAwareness.java
>> Passed: containers/docker/TestMisc.java
>> Test results: passed: 8; failed: 1
>> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
>> Error: Some tests failed or other problems occurred.
> 
> These are hotspot tests. Covered by JDK-8230305 (hotspot changes). The
> plain read test passes on a properly configured host system with
> controller delegation.
> 
>> testing jdk.internal.platform APIs
>> 
>> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
>> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
>> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java
>> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java
>> Passed: jdk/internal/platform/docker/TestSystemMetrics.java
>> Test results: passed: 4; failed: 1
>> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
>> Error: Some tests failed or other problems occurred.
> 
> I believe cgroup/TestCgroupMetrics.java fails due to bad host config.
> It passes here:
> 
> [root@f31 jdk-jdk]# rm -rf JTwork/ JTreport && /media/disk/jtreg/bin/jtreg 
> -timeout:4 -jdk:../cgroupsv2-jdk/ -e PATH -verbose:summary 
> -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora 
> -Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform
> Directory "JTwork" not found: creating
> Directory "JTreport" not found: creating
> Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java
> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java
> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java
> Passed: jdk/internal/platform/docker/TestSystemMetrics.java
> Test results: passed: 5
> Report written to /home/sgehwolf/jdk-jdk/JTreport/html/report.html
> Results written to /home/sgehwolf/jdk-jdk/JTwork
> 
> JTR files available here:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/
> 
>> testing -XshowSettings:system launcher option
>> 
>> Passed: tools/launcher/Settings.java
>> Test results: passed: 1
> 
> Thanks for running the tests!
> 
> FWIW, jdk-submit came back clean. If we could get the initial support
> of this pushed soon it would be great. I'd be happy to fix any follow-
> up issues.
> 
> Thanks,
> Severin
> 
>> Bob.
>> 
>>> On Feb 11, 2020, at 1:04 PM, Severin Gehwolf <sgehw...@redhat.com> wrote:
>>> 
>>> Hi Mandy, Bob,
>>> 
>>> Thanks again for the reviews and patience on this. Sorry it took me so
>>> long to get back to this :-/
>>> 
>>> Updated webrev:
>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/
>>> incremental: 
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/
>>> 
>>> I've tested this with docker tests on cgroup v1 and via podman on a
>>> cgroup v2 system. They pass. I'll be running this through jdk-submit as
>>> well.
>>> 
>>> More below.
>>> 
>>> On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote:
>>>> Hi Severin,
>>>> 
>>>> Thanks for the update.
>>>> 
>>>> On 1/21/20 11:30 AM, Severin Gehwolf wrote:
>>>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/
>>>>> incremental: 
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/
>>>>> 
>>>> 
>>>> I have answered my own question.   Most of the metrics used to return 0 if 
>>>> unavailable due to IOException reading a file or malformed content and now 
>>>> are changed to return -2 due to error fetching the metric.
>>>> 
>>>> The following are about limits which used to return -1 if unlimited or no 
>>>> limit set.
>>>>   public long getCpuQuota();
>>>>   public long getCpuShares();
>>>>   public long getMemoryLimit();
>>>>   public long getMemoryAndSwapLimit();
>>>>   public long getMemorySoftLimit();
>>>> 
>>>> With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to 
>>>> return -1 if unlimited or no limit set.  However the implementation does 
>>>> return -1.  All of the above specify to return -2 if unavailable due to 
>>>> error fetching the metric. 
>>>> 
>>>> I found the implementation quite hard to follow.  I spent some time 
>>>> reviewing the code to see if the implementation matches the spec  but I 
>>>> can't easily tell yet.  For example,
>>>> CgroupSubsystemController::getLongValueMatchingLine returns -1 when 
>>>> IOException occurs.
>>>> CgroupSubsystemController::getLongEntry returns 0L if IOException occurs.
>>>> 
>>>> CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE
>>>> when value overflows
>>> 
>>> This one is intentional. It's mapped back to unlimited via
>>> longValOrUnlimited(). The reason for this is that cgroup v1 doesn't
>>> have a concept of "unlimited". Unlimited values will be a very large
>>> numbers in cgroup v1 files.
>>> 
>>>> CgroupV2SubsystemController::convertStringToLong returns -1 when 
>>>> IOException occurs
>>>> 
>>>> CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024
>>>> CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0
>>> 
>>> These two are special cases too. See the implementation note of
>>> Metrics.getCpuShares(). In the cgroup v2 case the default value is 100
>>> (over 1024 in cgroup v1). That's why unlimited is being returned for
>>> those values.
>>> 
>>>> CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or 
>>>> malformed content
>>>> CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error
>>>>  This is called by getBlkIOServiceCount and getBlkIOServiced
>>>> 
>>>> I think this can be improved and add the documentation to describe
>>>> what the methods do.  Since Metrics APIs consistently return -2 if 
>>>> unavailable due to error in fetching the metric, why some utility
>>>> methods in *Subsystem and *SubsystemController return -1 upon error
>>>> and 0 when unlimited?
>>>> 
>>>> I suspect if the getXXXValue and other methods are clearly documented
>>>> with the error cases (possibly renaming the method name if appropriate)
>>>> CgroupV1Subsystem and CgroupV2SubSystem will become very explicit 
>>>> to understand.
>>> 
>>> This should be fixed now.
>>> 
>>> I've gone through the API doc of Metrics.java and have updated it. In
>>> general, I've updated it to return -1 if metric is unavailable (due to
>>> error in reading some files or content being empty), and -2 if not
>>> supported. No method returns -2 currently, but it might change and it's
>>> good to have some way of saying "not implementable" for this subsystem
>>> in the spec. That's my take on it anyway.
>>> 
>>> There is also a new unit test for shared controller logic:
>>> TestCgroupSubsystemController.java
>>> 
>>> It execises various cases of error/success.
>>> 
>>> That is to ensure proper symmetry across the various cases (including
>>> IOException). I've also documented static methods in
>>> CgroupSubsystemController. Overall, all methods now return the same
>>> values for cgroup v1 and cgroup v2 (given the impl nuances) for the
>>> various cases.
>>> 
>>>> CgroupSubsystem.java
>>>> 
>>>> 44     public static final double DOUBLE_RETVAL_NOT_SUPPORTED = 
>>>> LONG_RETVAL_NOT_SUPPORTED;
>>>> 49     public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null;
>>>> 
>>>> They are no longer needed, right?
>>> 
>>> Removed.
>>> 
>>>> CgroupSubsystemFactory.java
>>>> 
>>>> 89             System.err.println("Warning: Mixed cgroupv1 and cgroupv2 
>>>> not supported. Metrics disabled.");
>>>> 
>>>> 
>>>> I expect this be a System.Logger log 
>>> 
>>> Updated.
>>> 
>>>> 114                     if 
>>>> (!Integer.valueOf(0).toString().equals(tokens[0])) {
>>>> 
>>>> This can be simplified to if (!"0".equals(tokens[0]))
>>> 
>>> Done, thanks!
>>> 
>>>> LauncherHelper.java
>>>> 
>>>> 407         // Extended cgroupv1 specific metrics
>>>> 408         if (c instanceof MetricsCgroupV1) {
>>>> 409             MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c;
>>>> 410             limit = cgroupV1.getKernelMemoryLimit();
>>>> 411             ostream.println(formatLimitString(limit, INDENT + "Kernel 
>>>> Memory Limit: "));
>>>> 412             limit = cgroupV1.getTcpMemoryLimit();
>>>> 413             ostream.println(formatLimitString(limit, INDENT + "TCP 
>>>> Memory Limit: "));
>>>> 414             Boolean value = cgroupV1.isMemoryOOMKillEnabled();
>>>> 415             ostream.println(formatBoolean(value, INDENT + "Out Of 
>>>> Memory Killer Enabled: "));
>>>> 416             value = cgroupV1.isCpuSetMemoryPressureEnabled();
>>>> 417             ostream.println(formatBoolean(value, INDENT + "CPUSet 
>>>> Memory Pressure Enabled: "));
>>>> 418         }
>>>> 
>>>> MetricsCgroupV1 is linux-only.  It will fail the compilation when
>>>> building on non-linux.  One option is to move this code to 
>>>>  src/java.base/linux/share/sun/launcher/CgroupMetrics.java
>>>> 
>>>> Are they continued to be interesting metrics to be output from
>>>> -XshowSetting?  I wonder if they can simply be dropped from the output.
>>>> Bob will have an opinion.
>>> 
>>> I've removed those extra cgroup v1 specific metrics printed via
>>> -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's
>>> only used in tests in webrev 10. On the other hand the idea would be
>>> for consumers to downcast it to MetricsCgroupV1 if they needed those
>>> extra metrics.
>>> 
>>> Thanks,
>>> Severin
>>> 
> 

Reply via email to