> On Feb 12, 2020, at 6:22 AM, Severin Gehwolf <sgehw...@redhat.com> wrote: > > Hi Bob, > > On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote: >> I applied your patch to the latest JDK 15 sources and ran the container >> tests on Oracle Linux 8.1 with podman/cgroupv2 enabled. There were some >> issues. >> I’m not sure if its my setup or not. > > Looking over them, they appear to be setup issues. > > Getting the setup working right was tricky for me as well. > > Aside: > podman has a --runtime switch which you could point to a cgroups v2 > capable crun binary for example. > https://bugs.openjdk.java.net/browse/JDK-8230305 has some examples of > how to use it. > > I also needed some extra setup to get the controllers' delegation to > work: > https://urldefense.com/v3/__https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/__;!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GXp0Ri0Wg$ > > >> I also ran the same build on Ubuntu with docker/cgroupv1. I didn't see any >> failures on >> the cgroupv1 system. > > Great, thanks! > >> Here are some notes: >> >> The podman version on OL 8.1 doesn't yes support cgroupv2. I >> built the latest from sources for this test. >> >> The docker tests take a very long time under podman! Longer than >> the cgroupv1 run. > > Possibly. I haven't done any fair comparison of the two.
It’s possible that it’s trying to access several container hubs. I may be getting some timeouts since I haven’t figured out how to specify proxies for podman. I’ve set a local http_proxy env variable but maybe that’s not working. > >> cpusets and cpusets.mems are blank on host and if none are specified >> on podman/docker run command. On cgroupv1 they are host values if not >> specified for container. Effective cpusets and cpusets.mems are set properly >> in a container. > > Yes, there seems to be slight differences to cgroup v1. You can only > set cpusets on containers if the host has the controller enabled, > though: > > # cat /sys/fs/cgroup/cgroup.subtree_control > cpuset cpu io memory pids > # podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt fedora:30 > /bin/bash > [root@5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > -f3)/cpuset.cpus > 0-1 > [root@5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java -XshowSettings:system > -version > Operating System Metrics: > Provider: cgroupv2 > Effective CPU Count: 2 > CPU Period: 100000us > CPU Quota: -1 > CPU Shares: -1 > List of Processors, 2 total: > 0 1 > List of Effective Processors, 2 total: > 0 1 > List of Memory Nodes: N/A > List of Available Memory Nodes, 1 total: > 0 > Memory Limit: 300.00M > Memory Soft Limit: Unlimited > Memory & Swap Limit: 600.00M > > openjdk version "15-internal" 2020-09-15 > OpenJDK Runtime Environment (build > 15-internal+0-adhoc.sgehwolf.openjdk-head-2) > OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, > mixed mode, sharing) > In my setup, your command above works correctly. The only thing that doesn’t work is if I run the -XshowSettings directly on the host OR I run a container without specifying —cpuset-cpus. The default setup seems wrong. > >> HOST OUTPUT: >> ./java -XshowSettings:system -version >> Operating System Metrics: >> Provider: cgroupv2 >> Effective CPU Count: 32 >> CPU Period: -1 >> CPU Quota: -1 >> CPU Shares: -1 >> List of Processors: N/A >> List of Effective Processors: N/A >> List of Memory Nodes: N/A >> List of Available Memory Nodes: N/A >> Memory Limit: Unlimited >> Memory Soft Limit: Unlimited >> Memory & Swap Limit: Unlimited >> >> CONTAINER OUTPUT: >> # podman run -it -v `pwd`:/mnt ubuntu bash >> root@3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system -version >> Operating System Metrics: >> Provider: cgroupv2 >> Effective CPU Count: 32 >> CPU Period: 100000us >> CPU Quota: -1 >> CPU Shares: -1 >> List of Processors: N/A >> List of Effective Processors, 32 total: >> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 >> 28 29 30 31 >> List of Memory Nodes: N/A >> List of Available Memory Nodes, 1 total: >> 0 >> Memory Limit: Unlimited >> Memory Soft Limit: Unlimited >> Memory & Swap Limit: Unlimited > > Yes, host and container output differ depending on the configured test > system. I remember that I had cpusets working on the host system too at > some point, but I forgot what the magic was to get this properly > delegated via systemd. > >> Docker tests fail if /bin/docker is not available in podman setup. We >> probably should enhance the docker check to also look for podman. > > Could you be more specific about this? How do you run docker tests? I > use: > > -e PATH -verbose:summary -Djdk.test.container.command=podman > -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 > > with -Djdk.test.container.command=podman it shouldn't need docker. Did > you specify that property? I thought that most podman setups try to provide docker compatibility. I did not try using the properties. In order to make our automation work cleaner, I was hoping that we could always just execute docker. > >> Two container tests failed: >> >> FAILED: containers/cgroup/PlainRead.java failed Memory Limit is: -2 instead >> of unlimited or -1. This is because memory.max is not foumd. > > This doesn't fail for me, because I've got memory.max present on host: > > # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > -f3)/cgroup.controllers > cpu io memory pids > [root@f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > -f3)/memory.max > max > > Note, howevre, this is a hotspot test. So support for it for cgroup v2 > came with JDK-8230305. It seems we should return -1 if memory.max is > not found (over -2, not supported). Could you file a bug for this? It's > unrelated to this change. It should be a simple fix. Here are my controllers: cpuset cpu io memory pids rdma This seems like another case where this file doesn’t exist in the path we form on the host. [0.035s][debug][os,container] /sys/fs/cgroup/user.slice/user-23603.slice/session-14.scope/cpu.max failed, No such file or directory It does exist here: /sys/fs/cgroup/user.slice so maybe it’s a delegation issue. > >> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java >> This fails because nr_periods line doesn't always exist. I think you’ve got >> to enable a quota for this to appear (not sure). > > Passes for me, but it needs the cpu controller enabled on the test > system. > > # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > -f3)/cgroup.controllers > cpu io memory pids > # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.stat > usage_usec 1157537 > user_usec 606832 > system_usec 550705 > nr_periods 0 > nr_throttled 0 > throttled_usec 0 > >> Here’s the contents: >> % more cpu.stat >> usage_usec 23974562755 >> user_usec 22257183568 >> system_usec 1717379186 > > It suggests you've got the cpu controller disabled: > https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html*cpu__;Iw!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GWZKkNZMA$ > > > """ > and the following three when the controller is enabled: > > > nr_periods > nr_throttled > throttled_usec > “"" I do have the cpu controller enabled. I’ll look into your fixes for delegation issues and try again. Bob. > >> CGROUPV2 results on Oracle Linux 8.1 >> --------- >> >> testing container APIs >> >> FAILED: containers/cgroup/PlainRead.java >> Passed: containers/docker/DockerBasicTest.java >> Passed: containers/docker/TestCPUAwareness.java >> Passed: containers/docker/TestCPUSets.java >> Passed: containers/docker/TestJcmdWithSideCar.java >> Passed: containers/docker/TestJFREvents.java >> Passed: containers/docker/TestJFRNetworkEvents.java >> Passed: containers/docker/TestMemoryAwareness.java >> Passed: containers/docker/TestMisc.java >> Test results: passed: 8; failed: 1 >> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork >> Error: Some tests failed or other problems occurred. > > These are hotspot tests. Covered by JDK-8230305 (hotspot changes). The > plain read test passes on a properly configured host system with > controller delegation. > >> testing jdk.internal.platform APIs >> >> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java >> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java >> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java >> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java >> Passed: jdk/internal/platform/docker/TestSystemMetrics.java >> Test results: passed: 4; failed: 1 >> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork >> Error: Some tests failed or other problems occurred. > > I believe cgroup/TestCgroupMetrics.java fails due to bad host config. > It passes here: > > [root@f31 jdk-jdk]# rm -rf JTwork/ JTreport && /media/disk/jtreg/bin/jtreg > -timeout:4 -jdk:../cgroupsv2-jdk/ -e PATH -verbose:summary > -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora > -Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform > Directory "JTwork" not found: creating > Directory "JTreport" not found: creating > Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java > Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java > Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java > Passed: jdk/internal/platform/docker/TestSystemMetrics.java > Test results: passed: 5 > Report written to /home/sgehwolf/jdk-jdk/JTreport/html/report.html > Results written to /home/sgehwolf/jdk-jdk/JTwork > > JTR files available here: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/ > >> testing -XshowSettings:system launcher option >> >> Passed: tools/launcher/Settings.java >> Test results: passed: 1 > > Thanks for running the tests! > > FWIW, jdk-submit came back clean. If we could get the initial support > of this pushed soon it would be great. I'd be happy to fix any follow- > up issues. > > Thanks, > Severin > >> Bob. >> >>> On Feb 11, 2020, at 1:04 PM, Severin Gehwolf <sgehw...@redhat.com> wrote: >>> >>> Hi Mandy, Bob, >>> >>> Thanks again for the reviews and patience on this. Sorry it took me so >>> long to get back to this :-/ >>> >>> Updated webrev: >>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ >>> incremental: >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ >>> >>> I've tested this with docker tests on cgroup v1 and via podman on a >>> cgroup v2 system. They pass. I'll be running this through jdk-submit as >>> well. >>> >>> More below. >>> >>> On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: >>>> Hi Severin, >>>> >>>> Thanks for the update. >>>> >>>> On 1/21/20 11:30 AM, Severin Gehwolf wrote: >>>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ >>>>> incremental: >>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ >>>>> >>>> >>>> I have answered my own question. Most of the metrics used to return 0 if >>>> unavailable due to IOException reading a file or malformed content and now >>>> are changed to return -2 due to error fetching the metric. >>>> >>>> The following are about limits which used to return -1 if unlimited or no >>>> limit set. >>>> public long getCpuQuota(); >>>> public long getCpuShares(); >>>> public long getMemoryLimit(); >>>> public long getMemoryAndSwapLimit(); >>>> public long getMemorySoftLimit(); >>>> >>>> With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to >>>> return -1 if unlimited or no limit set. However the implementation does >>>> return -1. All of the above specify to return -2 if unavailable due to >>>> error fetching the metric. >>>> >>>> I found the implementation quite hard to follow. I spent some time >>>> reviewing the code to see if the implementation matches the spec but I >>>> can't easily tell yet. For example, >>>> CgroupSubsystemController::getLongValueMatchingLine returns -1 when >>>> IOException occurs. >>>> CgroupSubsystemController::getLongEntry returns 0L if IOException occurs. >>>> >>>> CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE >>>> when value overflows >>> >>> This one is intentional. It's mapped back to unlimited via >>> longValOrUnlimited(). The reason for this is that cgroup v1 doesn't >>> have a concept of "unlimited". Unlimited values will be a very large >>> numbers in cgroup v1 files. >>> >>>> CgroupV2SubsystemController::convertStringToLong returns -1 when >>>> IOException occurs >>>> >>>> CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024 >>>> CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0 >>> >>> These two are special cases too. See the implementation note of >>> Metrics.getCpuShares(). In the cgroup v2 case the default value is 100 >>> (over 1024 in cgroup v1). That's why unlimited is being returned for >>> those values. >>> >>>> CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or >>>> malformed content >>>> CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error >>>> This is called by getBlkIOServiceCount and getBlkIOServiced >>>> >>>> I think this can be improved and add the documentation to describe >>>> what the methods do. Since Metrics APIs consistently return -2 if >>>> unavailable due to error in fetching the metric, why some utility >>>> methods in *Subsystem and *SubsystemController return -1 upon error >>>> and 0 when unlimited? >>>> >>>> I suspect if the getXXXValue and other methods are clearly documented >>>> with the error cases (possibly renaming the method name if appropriate) >>>> CgroupV1Subsystem and CgroupV2SubSystem will become very explicit >>>> to understand. >>> >>> This should be fixed now. >>> >>> I've gone through the API doc of Metrics.java and have updated it. In >>> general, I've updated it to return -1 if metric is unavailable (due to >>> error in reading some files or content being empty), and -2 if not >>> supported. No method returns -2 currently, but it might change and it's >>> good to have some way of saying "not implementable" for this subsystem >>> in the spec. That's my take on it anyway. >>> >>> There is also a new unit test for shared controller logic: >>> TestCgroupSubsystemController.java >>> >>> It execises various cases of error/success. >>> >>> That is to ensure proper symmetry across the various cases (including >>> IOException). I've also documented static methods in >>> CgroupSubsystemController. Overall, all methods now return the same >>> values for cgroup v1 and cgroup v2 (given the impl nuances) for the >>> various cases. >>> >>>> CgroupSubsystem.java >>>> >>>> 44 public static final double DOUBLE_RETVAL_NOT_SUPPORTED = >>>> LONG_RETVAL_NOT_SUPPORTED; >>>> 49 public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null; >>>> >>>> They are no longer needed, right? >>> >>> Removed. >>> >>>> CgroupSubsystemFactory.java >>>> >>>> 89 System.err.println("Warning: Mixed cgroupv1 and cgroupv2 >>>> not supported. Metrics disabled."); >>>> >>>> >>>> I expect this be a System.Logger log >>> >>> Updated. >>> >>>> 114 if >>>> (!Integer.valueOf(0).toString().equals(tokens[0])) { >>>> >>>> This can be simplified to if (!"0".equals(tokens[0])) >>> >>> Done, thanks! >>> >>>> LauncherHelper.java >>>> >>>> 407 // Extended cgroupv1 specific metrics >>>> 408 if (c instanceof MetricsCgroupV1) { >>>> 409 MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c; >>>> 410 limit = cgroupV1.getKernelMemoryLimit(); >>>> 411 ostream.println(formatLimitString(limit, INDENT + "Kernel >>>> Memory Limit: ")); >>>> 412 limit = cgroupV1.getTcpMemoryLimit(); >>>> 413 ostream.println(formatLimitString(limit, INDENT + "TCP >>>> Memory Limit: ")); >>>> 414 Boolean value = cgroupV1.isMemoryOOMKillEnabled(); >>>> 415 ostream.println(formatBoolean(value, INDENT + "Out Of >>>> Memory Killer Enabled: ")); >>>> 416 value = cgroupV1.isCpuSetMemoryPressureEnabled(); >>>> 417 ostream.println(formatBoolean(value, INDENT + "CPUSet >>>> Memory Pressure Enabled: ")); >>>> 418 } >>>> >>>> MetricsCgroupV1 is linux-only. It will fail the compilation when >>>> building on non-linux. One option is to move this code to >>>> src/java.base/linux/share/sun/launcher/CgroupMetrics.java >>>> >>>> Are they continued to be interesting metrics to be output from >>>> -XshowSetting? I wonder if they can simply be dropped from the output. >>>> Bob will have an opinion. >>> >>> I've removed those extra cgroup v1 specific metrics printed via >>> -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's >>> only used in tests in webrev 10. On the other hand the idea would be >>> for consumers to downcast it to MetricsCgroupV1 if they needed those >>> extra metrics. >>> >>> Thanks, >>> Severin >>> >