On Fri, 15 Dec 2023 11:57:51 GMT, Joachim Kern <jk...@openjdk.org> wrote:
>> On AIX, repeated calls to dlopen referring to the same shared library may >> result in different, unique dl handles to be returned from libc. In that it >> differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. >> One such problem is in the JVMTI agent handler. That problem was fixed with >> a local fix to said handler >> ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this >> fix causes follow-up problems since it assumes that the file name passed to >> `os::dll_load()` is the file that has been opened. It prevents efficient, >> os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. >> See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it >> is a hack that causes other, more uglier hacks to follow (see discussion of >> https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and >> os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached >> handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See >> [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other >> platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with two additional > commits since the last revision: > > - trailing whitespace > - Following most of Thomas proposals I like this, this is good. Small nits remain. src/hotspot/os/aix/os_aix.cpp line 30: > 28: #pragma alloca > 29: > 30: please remove whitespace change src/hotspot/os/aix/os_aix.cpp line 193: > 191: // local variables > 192: > 193: please remove whitespace change src/hotspot/os/aix/os_aix.cpp line 1113: > 1111: } > 1112: > 1113: please remove whitespace change src/hotspot/os/aix/porting_aix.cpp line 934: > 932: struct scnhdr the_scn; > 933: struct ldhdr the_ldr; > 934: size_t sz = FILHSZ + _AOUTHSZ_EXEC; please rename to xcoffsz, and make constexpr: `constexpr size_t xcoffsz = ...` src/hotspot/os/aix/porting_aix.cpp line 990: > 988: if (env == nullptr) { > 989: // no LIBPATH, try with LD_LIBRARY_PATH > 990: env = getenv("LD_LIBRARY_PATH"); Is LD_LIBRARY_PATH a thing on AIX? I thought it is only used on non-AIX. src/hotspot/os/aix/porting_aix.cpp line 1005: > 1003: // LIBPATH or LD_LIBRARY_PATH and second with burned in libpath. > 1004: // No check against current working directory > 1005: Libpath.print("%s:%s", env, rtv_linkedin_libpath()); Are you sure libpath env var has precedence over the baked-in libpath? src/hotspot/os/aix/porting_aix.cpp line 1097: > 1095: } > 1096: > 1097: pthread_mutex_lock(&g_handletable_mutex); You can make your life a lot easier by defining an RAII object at the start of the file: struct TableLocker { TableLocker() { pthread_mutex_lock(&g_handletable_mutex); } ~TableLocker() { pthread_mutex_unlock(&g_handletable_mutex); } }; and just place this at the beginning of your two functions TableLocker lock: ... no need to manually unlock then, with the danger of missing a return. src/hotspot/os/aix/porting_aix.cpp line 1101: > 1099: for (i = 0; i < g_handletable_used; i++) { > 1100: if (g_handletable[i].handle == libhandle) { > 1101: // handle found, decrease refcount `assert(refcount > 0, "Sanity"))` src/hotspot/os/aix/porting_aix.cpp line 1143: > 1141: // entry of the array to the place of the entry we want to remove > and overwrite it > 1142: if (i < g_handletable_used) { > 1143: g_handletable[i] = g_handletable[g_handletable_used]; To be super careful, I would zero out at least the handle of the moved item like this: `g_handletable[g_handletable_used].handle = nullptr` ------------- Changes requested by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1786400492 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870755 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870833 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870885 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429849403 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429858465 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429859923 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429868182 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429863665 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870057