Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]

Severin Gehwolf Thu, 18 Apr 2024 02:56:04 -0700

On Wed, 17 Apr 2024 01:07:04 GMT, Jan Kratochvil <jkratoch...@openjdk.org> 
wrote:


>>> IMHO `is_containerized()` is OK to return `false` even when running in a 
>>> container but with no limitations set.
>> 
>> The idea here is to use this property to tune OpenJDK for in-container, 
>> specifically k8s, use. In such a setup it's custom to run a single process 
>> within set resource constraints. In order to do this, we need a reliable way 
>> to distinguish that vs. non-containerized setup. If somebody really wants to 
>> run OpenJDK in a container expecting it to run like a physical OpenJDK 
>> deployment, that's when `-XX:-UseContainerSupport` should be used.
>
>> The idea here is to use this property to tune OpenJDK for in-container, 
>> specifically k8s, use. In such a setup it's custom to run a single process 
>> within set resource constraints.
> 
> The in-container tuning means to use all the available resources. Containers 
> in the real world have some memory limits set which is where my modified 
> patch still correctly identifies it as a container to use all the available 
> resources of the node which is the whole goal of the container detection code.
> 
>> In order to do this, we need a reliable way to distinguish that vs. 
>> non-containerized setup.
> 
> I expect it should have been written "We need a reliable way to distinguish 
> real world in-container vs. non-containerized setup. We do not mind behavior 
> for artificial containers on OpenJDK development machines.". Which is what my 
> patch does in an easier and less error-prone way.
> 
>> If somebody really wants to run OpenJDK in a container expecting it to run 
>> like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` 
>> should be used.
> 
> That behaves still the same with my patch.
> 
> Could you give a countercase where my patch behaves wrongly?

@jankratochvil I believe this boils down to what we actually want. Should 
`OSContainer::is_containerized()` return `false` when run *inside* a container? 
If so, when is it OK to do that? Should `OSContainer::is_containerized()` 
return `true` on a physical Linux deployment? IMO, the read-only property of 
the mount points was something that fit naturally since, we already scan those 
anyway for (cgv1 vs cgv2 detection). Therefore it was something to consider to 
make heuristics more accurate.

The truth table of the patch in this PR looks like this:
| `OSContainer::is_containerized()` value  | Actual deployment scenario |
| ------------- | ------------- |
| `true`  | OpenJDK runs in an unprivileged container **without** a cpu/memory 
limit |
| `true`  | OpenJDK runs in an unprivileged container **with** a cpu/memory 
limit |
| `true`  | OpenJDK runs in a privileged container **with** a cpu/memory limit |
| `false`  | OpenJDK runs in a privileged container **without** a cpu/memory 
limit |
| `false`  | OpenJDK runs in a systemd slice **without** a cpu/memory limit |
| `true`  | OpenJDK runs in a systemd slice **with** a cpu/memory limit |
| `false`  | OpenJDK runs on a physical Linux system (VM or bare metal) |

As you can see, the case of "OpenJDK runs in a privileged container *without* a 
cpu/memory limit" gives the wrong result. However, I consider this a fairly 
uncommon setup and there isn't really anything we can do to detect this kind of 
setup. Even if we did manage to detect it, why would we care? It's a question 
of probability.

> Could you give a countercase where my patch behaves wrongly?

Again, it's not a case of right or wrong IMO. Since we are in the land of 
heuristics, they will be wrong in some cases. We should make them so that we 
cover the common cases and, perhaps, are able to use those in serviceability 
tools to help guide diagnosis and/or further tuning. So far the existing 
capabilities were OK, but prevent further out-of-the-box tuning and/or accurate 
data collection.

Your proposed patch (it's one I had in a previous iteration too, fwiw) would 
also return `false` for the case of "OpenJDK runs in an unprivileged container 
**without** a cpu/memory limit", which seems counterintuitive if OpenJDK 
actually runs in a container! What's more, it seems a fairly common case. Also, 
there is a chance of the OpenJDK heuristics to fail memory/cpu limit detection 
because of bugs and new developments. It seems the safer option to know that 
OpenJDK is containerized (using other heuristics) in that case. Maybe that's 
just me.

Let's have that discussion more broadly and see if we can reach consensus!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2063477204

Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]

Reply via email to