Hi Bob,

On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote:
> I applied your patch to the latest JDK 15 sources and ran the container
> tests on Oracle Linux 8.1 with podman/cgroupv2 enabled.  There were some 
> issues.
> I’m not sure if its my setup or not.  

Looking over them, they appear to be setup issues.

Getting the setup working right was tricky for me as well.

Aside:
podman has a --runtime switch which you could point to a cgroups v2
capable crun binary for example.
https://bugs.openjdk.java.net/browse/JDK-8230305 has some examples of
how to use it.

I also needed some extra setup to get the controllers' delegation to
work:
https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/

> I also ran the same build on Ubuntu with docker/cgroupv1.  I didn't see any 
> failures on
> the cgroupv1 system.

Great, thanks!

> Here are some notes:
> 
> The podman version on OL 8.1 doesn't yes support cgroupv2.  I
> built the latest from sources for this test.
> 
> The docker tests take a very long time under podman!  Longer than
> the cgroupv1 run.

Possibly. I haven't done any fair comparison of the two.

> cpusets and cpusets.mems are blank on host and if none are specified
> on podman/docker run command.  On cgroupv1 they are host values if not
> specified for container.  Effective cpusets and cpusets.mems are set properly
> in a container.

Yes, there seems to be slight differences to cgroup v1. You can only
set cpusets on containers if the host has the controller enabled,
though:

# cat /sys/fs/cgroup/cgroup.subtree_control
cpuset cpu io memory pids
# podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt fedora:30 
/bin/bash
[root@5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' 
-f3)/cpuset.cpus
0-1
[root@5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java -XshowSettings:system -version
Operating System Metrics:
    Provider: cgroupv2
    Effective CPU Count: 2
    CPU Period: 100000us
    CPU Quota: -1
    CPU Shares: -1
    List of Processors, 2 total: 
    0 1 
    List of Effective Processors, 2 total: 
    0 1 
    List of Memory Nodes: N/A
    List of Available Memory Nodes, 1 total: 
    0 
    Memory Limit: 300.00M
    Memory Soft Limit: Unlimited
    Memory & Swap Limit: 600.00M

openjdk version "15-internal" 2020-09-15
OpenJDK Runtime Environment (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2)
OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, 
mixed mode, sharing)


> HOST OUTPUT:
> ./java -XshowSettings:system -version
> Operating System Metrics:
>     Provider: cgroupv2
>     Effective CPU Count: 32
>     CPU Period: -1
>     CPU Quota: -1
>     CPU Shares: -1
>     List of Processors: N/A
>     List of Effective Processors: N/A
>     List of Memory Nodes: N/A
>     List of Available Memory Nodes: N/A
>     Memory Limit: Unlimited
>     Memory Soft Limit: Unlimited
>     Memory & Swap Limit: Unlimited
> 
> CONTAINER OUTPUT:
> # podman run -it -v `pwd`:/mnt ubuntu bash
> root@3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system -version
> Operating System Metrics:
>     Provider: cgroupv2
>     Effective CPU Count: 32
>     CPU Period: 100000us
>     CPU Quota: -1
>     CPU Shares: -1
>     List of Processors: N/A
>     List of Effective Processors, 32 total: 
>     0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 
> 28 29 30 31 
>     List of Memory Nodes: N/A
>     List of Available Memory Nodes, 1 total: 
>     0 
>     Memory Limit: Unlimited
>     Memory Soft Limit: Unlimited
>     Memory & Swap Limit: Unlimited

Yes, host and container output differ depending on the configured test
system. I remember that I had cpusets working on the host system too at
some point, but I forgot what the magic was to get this properly
delegated via systemd.

> Docker tests fail if /bin/docker is not available in podman setup.  We
> probably should enhance the docker check to also look for podman.

Could you be more specific about this? How do you run docker tests? I
use:

-e PATH -verbose:summary -Djdk.test.container.command=podman 
-Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30

with -Djdk.test.container.command=podman it shouldn't need docker. Did
you specify that property?

> Two container tests failed:
> 
> FAILED: containers/cgroup/PlainRead.java failed Memory Limit is: -2 instead
> of unlimited or -1.  This is because memory.max is not foumd.

This doesn't fail for me, because I've got memory.max present on host:

# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers
cpu io memory pids
[root@f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' 
-f3)/memory.max
max

Note, howevre, this is a hotspot test. So support for it for cgroup v2
came with JDK-8230305. It seems we should return -1 if memory.max is
not found (over -2, not supported). Could you file a bug for this? It's
unrelated to this change. It should be a simple fix.

> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
> This fails because nr_periods line doesn't always exist.  I think you’ve got
> to enable a quota for this to appear (not sure).  

Passes for me, but it needs the cpu controller enabled on the test
system.

# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers
cpu io memory pids
# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.stat
usage_usec 1157537
user_usec 606832
system_usec 550705
nr_periods 0
nr_throttled 0
throttled_usec 0

> Here’s the contents:
> % more cpu.stat
> usage_usec 23974562755
> user_usec 22257183568
> system_usec 1717379186

It suggests you've got the cpu controller disabled:
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#cpu

"""
and the following three when the controller is enabled:


nr_periods
nr_throttled
throttled_usec
"""

> CGROUPV2 results on Oracle Linux 8.1
> ---------
> 
> testing container APIs
> 
> FAILED: containers/cgroup/PlainRead.java
> Passed: containers/docker/DockerBasicTest.java
> Passed: containers/docker/TestCPUAwareness.java
> Passed: containers/docker/TestCPUSets.java
> Passed: containers/docker/TestJcmdWithSideCar.java
> Passed: containers/docker/TestJFREvents.java
> Passed: containers/docker/TestJFRNetworkEvents.java
> Passed: containers/docker/TestMemoryAwareness.java
> Passed: containers/docker/TestMisc.java
> Test results: passed: 8; failed: 1
> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
> Error: Some tests failed or other problems occurred.

These are hotspot tests. Covered by JDK-8230305 (hotspot changes). The
plain read test passes on a properly configured host system with
controller delegation.

> testing jdk.internal.platform APIs
> 
> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java
> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java
> Passed: jdk/internal/platform/docker/TestSystemMetrics.java
> Test results: passed: 4; failed: 1
> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
> Error: Some tests failed or other problems occurred.

I believe cgroup/TestCgroupMetrics.java fails due to bad host config.
It passes here:

[root@f31 jdk-jdk]# rm -rf JTwork/ JTreport && /media/disk/jtreg/bin/jtreg 
-timeout:4 -jdk:../cgroupsv2-jdk/ -e PATH -verbose:summary 
-Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora 
-Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform
Directory "JTwork" not found: creating
Directory "JTreport" not found: creating
Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java
Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java
Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java
Passed: jdk/internal/platform/docker/TestSystemMetrics.java
Test results: passed: 5
Report written to /home/sgehwolf/jdk-jdk/JTreport/html/report.html
Results written to /home/sgehwolf/jdk-jdk/JTwork

JTR files available here:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/

> testing -XshowSettings:system launcher option
> 
> Passed: tools/launcher/Settings.java
> Test results: passed: 1

Thanks for running the tests!

FWIW, jdk-submit came back clean. If we could get the initial support
of this pushed soon it would be great. I'd be happy to fix any follow-
up issues.

Thanks,
Severin

> Bob.
> 
> > On Feb 11, 2020, at 1:04 PM, Severin Gehwolf <sgehw...@redhat.com> wrote:
> > 
> > Hi Mandy, Bob,
> > 
> > Thanks again for the reviews and patience on this. Sorry it took me so
> > long to get back to this :-/
> > 
> > Updated webrev:
> > Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/
> > incremental: 
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/
> > 
> > I've tested this with docker tests on cgroup v1 and via podman on a
> > cgroup v2 system. They pass. I'll be running this through jdk-submit as
> > well.
> > 
> > More below.
> > 
> > On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote:
> > > Hi Severin,
> > > 
> > > Thanks for the update.
> > > 
> > > On 1/21/20 11:30 AM, Severin Gehwolf wrote:
> > > > Full: 
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/
> > > > incremental: 
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/
> > > > 
> > > 
> > > I have answered my own question.   Most of the metrics used to return 0 
> > > if unavailable due to IOException reading a file or malformed content and 
> > > now are changed to return -2 due to error fetching the metric.
> > > 
> > > The following are about limits which used to return -1 if unlimited or no 
> > > limit set.
> > >    public long getCpuQuota();
> > >    public long getCpuShares();
> > >    public long getMemoryLimit();
> > >    public long getMemoryAndSwapLimit();
> > >    public long getMemorySoftLimit();
> > > 
> > > With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to 
> > > return -1 if unlimited or no limit set.  However the implementation does 
> > > return -1.  All of the above specify to return -2 if unavailable due to 
> > > error fetching the metric. 
> > > 
> > > I found the implementation quite hard to follow.  I spent some time 
> > > reviewing the code to see if the implementation matches the spec  but I 
> > > can't easily tell yet.  For example,
> > > CgroupSubsystemController::getLongValueMatchingLine returns -1 when 
> > > IOException occurs.
> > > CgroupSubsystemController::getLongEntry returns 0L if IOException occurs.
> > > 
> > > CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE
> > > when value overflows
> > 
> > This one is intentional. It's mapped back to unlimited via
> > longValOrUnlimited(). The reason for this is that cgroup v1 doesn't
> > have a concept of "unlimited". Unlimited values will be a very large
> > numbers in cgroup v1 files.
> > 
> > > CgroupV2SubsystemController::convertStringToLong returns -1 when 
> > > IOException occurs
> > > 
> > > CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024
> > > CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0
> > 
> > These two are special cases too. See the implementation note of
> > Metrics.getCpuShares(). In the cgroup v2 case the default value is 100
> > (over 1024 in cgroup v1). That's why unlimited is being returned for
> > those values.
> > 
> > > CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max 
> > > or malformed content
> > > CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error
> > >   This is called by getBlkIOServiceCount and getBlkIOServiced
> > > 
> > > I think this can be improved and add the documentation to describe
> > > what the methods do.  Since Metrics APIs consistently return -2 if 
> > > unavailable due to error in fetching the metric, why some utility
> > > methods in *Subsystem and *SubsystemController return -1 upon error
> > > and 0 when unlimited?
> > > 
> > > I suspect if the getXXXValue and other methods are clearly documented
> > > with the error cases (possibly renaming the method name if appropriate)
> > > CgroupV1Subsystem and CgroupV2SubSystem will become very explicit 
> > > to understand.
> > 
> > This should be fixed now.
> > 
> > I've gone through the API doc of Metrics.java and have updated it. In
> > general, I've updated it to return -1 if metric is unavailable (due to
> > error in reading some files or content being empty), and -2 if not
> > supported. No method returns -2 currently, but it might change and it's
> > good to have some way of saying "not implementable" for this subsystem
> > in the spec. That's my take on it anyway.
> > 
> > There is also a new unit test for shared controller logic:
> > TestCgroupSubsystemController.java
> > 
> > It execises various cases of error/success.
> > 
> > That is to ensure proper symmetry across the various cases (including
> > IOException). I've also documented static methods in
> > CgroupSubsystemController. Overall, all methods now return the same
> > values for cgroup v1 and cgroup v2 (given the impl nuances) for the
> > various cases.
> > 
> > > CgroupSubsystem.java
> > > 
> > >  44     public static final double DOUBLE_RETVAL_NOT_SUPPORTED = 
> > > LONG_RETVAL_NOT_SUPPORTED;
> > >  49     public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null;
> > > 
> > > They are no longer needed, right?
> > 
> > Removed.
> > 
> > > CgroupSubsystemFactory.java
> > > 
> > >  89             System.err.println("Warning: Mixed cgroupv1 and cgroupv2 
> > > not supported. Metrics disabled.");
> > > 
> > > 
> > > I expect this be a System.Logger log 
> > 
> > Updated.
> > 
> > > 114                     if 
> > > (!Integer.valueOf(0).toString().equals(tokens[0])) {
> > > 
> > > This can be simplified to if (!"0".equals(tokens[0]))
> > 
> > Done, thanks!
> > 
> > > LauncherHelper.java
> > > 
> > > 407         // Extended cgroupv1 specific metrics
> > > 408         if (c instanceof MetricsCgroupV1) {
> > > 409             MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c;
> > > 410             limit = cgroupV1.getKernelMemoryLimit();
> > > 411             ostream.println(formatLimitString(limit, INDENT + "Kernel 
> > > Memory Limit: "));
> > > 412             limit = cgroupV1.getTcpMemoryLimit();
> > > 413             ostream.println(formatLimitString(limit, INDENT + "TCP 
> > > Memory Limit: "));
> > > 414             Boolean value = cgroupV1.isMemoryOOMKillEnabled();
> > > 415             ostream.println(formatBoolean(value, INDENT + "Out Of 
> > > Memory Killer Enabled: "));
> > > 416             value = cgroupV1.isCpuSetMemoryPressureEnabled();
> > > 417             ostream.println(formatBoolean(value, INDENT + "CPUSet 
> > > Memory Pressure Enabled: "));
> > > 418         }
> > > 
> > > MetricsCgroupV1 is linux-only.  It will fail the compilation when
> > > building on non-linux.  One option is to move this code to 
> > >   src/java.base/linux/share/sun/launcher/CgroupMetrics.java
> > > 
> > > Are they continued to be interesting metrics to be output from
> > > -XshowSetting?  I wonder if they can simply be dropped from the output.
> > > Bob will have an opinion.
> > 
> > I've removed those extra cgroup v1 specific metrics printed via
> > -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's
> > only used in tests in webrev 10. On the other hand the idea would be
> > for consumers to downcast it to MetricsCgroupV1 if they needed those
> > extra metrics.
> > 
> > Thanks,
> > Severin
> > 

Reply via email to