On Wed, 18 May 2022 18:14:52 GMT, Severin Gehwolf <sgehw...@openjdk.org> wrote:
>> Please review this change to the cgroup v1 subsystem which makes it more >> resilient on some of the stranger systems. Unfortunately, I wasn't able to >> re-create a similar system as the reporter. The idea of using the longest >> substring match for the cgroupv1 file paths was based on the fact that on >> systemd systems processes run in separate scopes and the maven forked test >> runner might exhibit this property. For that it makes sense to use the >> common ancestor path. Nothing changes in the common cases where the >> `cgroup_path` matches `_root` and when the `_root` is `/` (container the >> former, host system the latter). >> >> In addition, the code paths which are susceptible to throw NPE have been >> hardened to catch those situations. Should it happen in the future it makes >> more sense (to me) to not have accurate container detection in favor of >> continuing to keep running. >> >> Finally, with the added unit-tests a bug was uncovered on the "substring" >> match case of cgroup paths in hotspot. `p` returned from `strstr` can never >> point to `_root` as it's used as the "needle" to find in "haystack" >> `cgroup_path` (not the other way round). >> >> Testing: >> - [x] Added unit tests >> - [x] GHA >> - [x] Container tests on cgroups v1 Linux. Continue to pass > > Severin Gehwolf has updated the pull request incrementally with two > additional commits since the last revision: > > - Refactor hotspot gtest > - Separate into function. Fix comment. Changes requested by iklam (Reviewer). src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 60: > 58: strncpy(buf, _mount_point, MAXPATHLEN); > 59: buf[MAXPATHLEN-1] = '\0'; > 60: _path = os::strdup(buf); Comment on the old code: why this cannot be simply _path = os::strdup(_mount_point); Also, all the strncat calls in this function seem problematic, and should be converted to stringStream for consistency. src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 63: > 61: } else { > 62: char *p = strstr(cgroup_path, _root); > 63: if (p != NULL && p == cgroup_path) { What happens if cgroup_path is "/foo/bar" and _root is "/fo"? Maybe this block should be combined with the new `else` block you're adding? However, the behavior seems opposite between these two blocks of code: The upper block: _root is a prefix of cgroup_path, we take the **tail** of cgroup_path TestCase substring_match = { "/sys/fs/cgroup/memory", // mount_path "/user.slice/user-1000.slice", // root_path "/user.slice/user-1000.slice/user@1001.service", // cgroup_path "/sys/fs/cgroup/memory/user@1001.service" // expected_path }; The lower block: The beginning of _root is a prefix of cgroup_path, we take the **head** of cgroup_path TestCase prefix_matched_cg = { "/sys/fs/cgroup/memory", // mount_path "/user.slice/user-1000.slice/session-50.scope", // root_path "/user.slice/user-1000.slice/session-3.scope", // cgroup_path "/sys/fs/cgroup/memory/user.slice/user-1000.slice" // expected_path }; I find the behavior hard to understand. I think the rules (and reasons) should be added to the comment block above the function. src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 86: > 84: const char* cgroup_p = cgroup_path; > 85: int last_slash = find_last_slash_pos(root_p, cgroup_p); > 86: assert(last_slash >= 0, "not an absolute path?"); Are root_p and cgroup_p directly read from the /proc/xxx files. If so, do we validate the input to make sure they are absolute paths? It seems like our code cannot handle trailing '/' in the input. If so, we should clear all trailing '/' from the input string. Then, in functions that process them, we should assert that they don't end with slash. See my comment in find_last_slash_pos(). src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 102: > 100: for (int i = 0; *s1 == *s2 && *s1 != 0; i++) { > 101: if (*s1 == '/') { > 102: last_matching_slash_pos = i; I found the behavior a little hard to understand. Is it intentional? "/cat/cow", "/cat/dog" -> "/cat/" "/cat/", "/cat/dog" -> "/cat/" "/cat", "/cat/dog" -> "/" The name `find_last_slash_pos` doesn't properly describe the behavior. I thought about `find_common_path_prefix`, but that doesn't match the current behavior (for the 3rd case, the common path prefix is `"/cat"`). ------------- PR: https://git.openjdk.java.net/jdk/pull/8629