On Mon, 2 Nov 2020 22:13:30 GMT, Daniel D. Daugherty <dcu...@openjdk.org> wrote:
>> Changes from @fisk and @dcubed-ojdk to:
>>
>> - simplify ObjectMonitor list management
>> - get rid of Type-Stable Memory (TSM)
>>
>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8; no new
>> regressions.
>> Aurora Perf runs have also been done (DaCapo-h2, Quick Startup/Footprint,
>> SPECjbb2015-Tuned-G1, SPECjbb2015-Tuned-ParGC, Volano)
>>   - a few minor regressions (<= -0.24%)
>>   - Volano is 6.8% better
>>
>> Eric C. has also been running promotion perf runs on these bits and says "the
>> results look fine".
>
> Self review done.

### Gory details about these changes from @fisk and @dcubed-ojdk:

### Simplify `ObjectMonitor` List Management:

- delete per-thread in-use and free-lists.
- delete global free-list and global wait-list; there is still a global in-use list; after `ObjectMonitor`s on the global in-use list are deflated, they are unlinked and added to a function-local `GrowableArray` called `delete_list`; we do a handshake/safepoint with all JavaThreads and that makes all the `ObjectMonitor`s on the `delete_list` safe for deletion; lastly, we delete all the `ObjectMonitor`s on `delete_list`.
- move async deflation work from the `ServiceThread` to a dedicated `MonitorDeflationThread`; this prevents `ObjectMonitor` inflation storms from delaying the work done by the `ServiceThread` for other subsystems; this means the `ServiceThread` no longer wakes up every `GuaranteedSafepointInterval` to check for work.
- the `AllocationState` enum is dropped along with the `_allocation_state` field and associated getters and setters; the simpler list management no longer requires the allocation state to be tracked.
- the safepoint cleanup phase no longer requests async monitor deflation; there is no longer a safepoint cleanup task for monitor deflation, but there is still an auditing/logging hook for debugging purposes.
- delete `ObjectSynchronizer` functions associated with more complicated list management: `deflate_global_idle_monitors()`, `deflate_per_thread_idle_monitors()`, `deflate_common_idle_monitors()`, `om_flush()`, `prepend_list_to_common()`, `prepend_list_to_global_free_list()`, `prepend_list_to_global_wait_list()`, `prepend_list_to_global_in_use_list()`, `prepend_to_common()`, `prepend_to_om_free_list()`, `prepend_to_om_in_use_list()`, `take_from_start_of_common()`, `take_from_start_of_global_free_list()`, `take_from_start_of_om_free_list()`
- delete the spin-lock functions needed by the more complicated list management.
- delete a number of audit/debug/logging related functions needed by the more complicated list management.
- restore the barrier related code that needed relocation due to `om_flush()`'s access of the weak obj reference; now that `om_flush()` is gone, the barrier related code can go back to its more natural place.

### Get Rid of Type-Stable Memory (TSM):

- `ObjectMonitor` now subclasses `CHeapObj<mtInternal>`.
- the `ObjectMonitor` constructor and destructor are now more normal C++!
- delete `ObjectMonitor` functions associated with TSM: `clear()`, `clear_common()`, `object_addr()`, `Recycle()`, and `set_object()`.
- delete the version of `set_owner_from()` that supports two possible old values since it is no longer needed; we are not recycling deflated `ObjectMonitor`s anymore so there's no longer a possibility of a `NULL` `_owner` value or a `DEFLATER_MARKER` value on the same code path.
- delete `ObjectSynchronizer` functions associated with TSM: `om_alloc()`, `om_release()`, `prepend_block_to_lists()`
- simplify `ObjectSynchronizer` functions related to TSM: `deflate_idle_monitors()`, `deflate_monitor_list()`, `inflate()`

### Change the Assumption That a Displaced Header is Always at Offset 0

- Change `markWord::displaced_mark_helper()` and `markWord::set_displaced_mark_helper()` to no longer assume that the displaced header in a `BasicLock` or `ObjectMonitor` is at offset 0.
- `ObjectMonitor::header_addr()` no longer requires the offset to be zero.

### New Diagnostic Options

- `AvgMonitorsPerThreadEstimate`
  - Used to estimate a variable ceiling based on number of threads for use with `MonitorUsedDeflationThreshold`; default is 1024, 0 is off, range is 0..max_jint. The current count of inflated `ObjectMonitor`s and the ceiling are used to determine whether the in-use ratio is higher than `MonitorUsedDeflationThreshold` (default 90).
- `MonitorDeflationMax`
  - The maximum number of `ObjectMonitor`s to deflate, unlink and delete at one time; default is 1 million; range is 1024..max_jint.

-------------

PR: https://git.openjdk.java.net/jdk/pull/642
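
For readers who want the interaction between `AvgMonitorsPerThreadEstimate` and `MonitorUsedDeflationThreshold` spelled out, here is a rough standalone sketch. It is not the HotSpot code: the constants and the functions `in_use_ceiling()` and `monitors_used_above_threshold()` are hypothetical stand-ins, and the sketch assumes the ceiling is simply the thread count multiplied by the per-thread estimate.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the two flags; the values are the defaults
// quoted in this message, not read from a running VM.
constexpr std::size_t   kAvgMonitorsPerThreadEstimate  = 1024; // 0 means "off"
constexpr std::uint64_t kMonitorUsedDeflationThreshold = 90;   // percent

// Assumption: the variable ceiling scales linearly with the thread count.
std::size_t in_use_ceiling(std::size_t thread_count) {
  return thread_count * kAvgMonitorsPerThreadEstimate;
}

// Returns true when the in-use ratio (inflated monitors / ceiling), as an
// integer percentage, is strictly higher than the threshold.
bool monitors_used_above_threshold(std::size_t in_use_count,
                                   std::size_t thread_count) {
  const std::size_t ceiling = in_use_ceiling(thread_count);
  if (ceiling == 0) {
    return false; // estimate disabled or no threads: nothing to compare against
  }
  const std::uint64_t usage_percent =
      (static_cast<std::uint64_t>(in_use_count) * 100) / ceiling;
  return usage_percent > kMonitorUsedDeflationThreshold;
}
```

Under these assumptions, 4 threads give a ceiling of 4096, so deflation would only be considered once more than roughly 90% of that many `ObjectMonitor`s are inflated.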