Re: RFR: 8329103: assert(!thread->in_asgct()) failed during multi-mode profiling
On Wed, 27 Mar 2024 01:02:41 GMT, Andrei Pangin wrote: > This fix makes `AsyncGetCallTrace` reentrant and async-signal-safe. > Reentrancy is required in the cases when two or more profiling engines are > running at the same time, e.g., when CPU and Wall clock profilers work > together and therefore one profiler may interrupt another in the middle of > getting a stack trace. > > Tested with async-profiler: > > java > -agentpath:/path/to/libasyncProfiler.so=start,event=cpu,interval=1ms,wall=1ms,file=profile.jfr Marked as reviewed by sspitsyn (Reviewer). src/hotspot/share/runtime/thread.hpp line 664: > 662: ThreadInAsgct(Thread* thread) : _thread(thread) { > 663: assert(thread != nullptr, "invariant"); > 664: // AsyncGetCallTrace is reentrant - save the previous state. Nit: It is possible to rephrase this comment as follows: // Allow AsyncGetCallTrace to be reentrant - save the previous state. `` - PR Review: https://git.openjdk.org/jdk/pull/18504#pullrequestreview-1967889766 PR Review Comment: https://git.openjdk.org/jdk/pull/18504#discussion_r1544102770
Re: RFR: JDK-8327769: jcmd GC.heap_dump without options should write to location given by -XX:HeapDumpPath, if set [v10]
On Thu, 28 Mar 2024 15:08:27 GMT, Matthias Baesken wrote: > Yes it is some work on documentation (but seems some doc work needs to be > done anyway because it was forgotten when `HeapDumpGzipLevel` was introduced). That currently only impacts 3 documents. Having HeapDumpGzipLevel apply to the jcmd impacts 4 more, and if it doesn't apply to the jcmd then you have an inconsistency with how HeapDumpXXX options are applied. There's also a question of whether currently missing doc updates for HeapDumpGzipLevel should be made part of this PR because it complicates back porting. We need to make sure we don't backport HeapDumpGzipLevel changes to a JDK version that has HeapDumpPath but not HeapDumpGzipLevel. I think as of right now HeapDumpPath is in 11 but HeapDumpGzipLevel is only backported to 17. > And the current name is generic and does not mention the OOM case in the name > itself, the current implementation would better match the name if it was > `HeapDumpPathOnOom` or something like this. There's still the question of whether or not it is even appropriate to have -XX options taking the place of jcmd options. - PR Comment: https://git.openjdk.org/jdk/pull/18190#issuecomment-2026620689
Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v4]
On Fri, 29 Mar 2024 03:50:03 GMT, Chris Plummer wrote: >> test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/libPopFrameTest.cpp >> line 161: >> >>> 159: int attempts = 0; >>> 160: while (!bp_sync_reached) { >>> 161: LOG("Main: ensureAtBreakpoint: waitig 5 millis\n"); >> >> Suggestion: >> >> LOG("Main: ensureAtBreakpoint: waiting 10 millis\n"); >> >> >> I think this should be moved directly above the wait(10) call. > > I think you missed this suggestion. There is a typo in the log message, and > this LOG should be move to just before the wait() statement below. Sorry, missed to do this. Fixed now. - PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1544041108
Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v6]
> This PR fixes a synchronization issue in the test: > `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest` > > The method `notifyAtBreakpoint()` can notify the `TestTask` thread when it > has not reached an expected breakpoint yet. > The fix is to add a call to the method `ensureAtBreakpoint()` one more time > in the `B2` sub-test. It is needed after the top-most frame was popped with > the JVMTI `PopFrame`, and the target thread needs to reach the breakpoint > again after its execution was resumed. > > The time is very intermittent. At least, I was not able to reproduce the > timeout failure in thousands of mach5 runs with the `-Xcomp` option. > > Testing: > - Run the test > `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest` thousands > times in mach5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: move log message to the right place - Changes: - all: https://git.openjdk.org/jdk/pull/18419/files - new: https://git.openjdk.org/jdk/pull/18419/files/14e0e60f..f8344149 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18419=05 - incr: https://webrevs.openjdk.org/?repo=jdk=18419=04-05 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18419.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18419/head:pull/18419 PR: https://git.openjdk.org/jdk/pull/18419
Re: RFR: 8318026: jcmd should provide access to detailed JVM object information [v11]
On Wed, 27 Mar 2024 18:31:50 GMT, Kevin Walls wrote: >> Introduce the jcmd "VM.inspect" to implement access to detailed JVM object >> information. >> >> Not recommended for live production use. Requires UnlockDiagnosticVMOptions >> and not included in jcmd help output, to remind us this is not a >> general-purpose customer-facing tool. > > Kevin Walls has updated the pull request incrementally with one additional > commit since the last revision: > > Test more pointer types: compiled method and metadata. test/hotspot/jtreg/serviceability/dcmd/vm/VMInspectTest.java line 117: > 115: output = executor.execute("VM.inspect -1"); > 116: output.shouldContain("address not safe"); > 117: Nit: Just a suggestion to make the test more readable. Now when more test cases have been added you may want to refactor it to call a separate method for each sub-test. E.g.: `testBaddAddresses()`, `testMisalignedAddress()`, `testCompiledMethodAddress()`, `testMetadataAddress()`, `testClassAddress()`, `testThreadAddress()`, etc. - PR Review Comment: https://git.openjdk.org/jdk/pull/17655#discussion_r1544038490
Re: RFR: JDK-8327769: jcmd GC.heap_dump without options should write to location given by -XX:HeapDumpPath, if set [v10]
On Thu, 28 Mar 2024 15:08:27 GMT, Matthias Baesken wrote: > Wouldn't this just be a case of changing a flag description? As luck has it, > the flag already has a generic name that is not tied to OOMs. One of the issues with this PR is that there are 7 places where some sort of doc/help update is needed. The help output for the `filename` option does not work well, because currently it is required option and has no default. This PR sort of gives it a default, but not always, so the default needs to be given in descriptive way , but jcmd help is not really setup to handle descriptive defaults, so we end up with this: _filename("filename","Name of the dump file", "STRING", false, "if no filename was specified, but -XX:HeapDumpPath=hdp is set, path hdp is taken"), And even this is not complete. HeapDumpPath can be a directory, in which case a filename is generated using the PID and a dump sequence number. Probably all this belongs in the argument description, which currently just says "Name of dump file", but you still need to put something in for the default. If the description is complete, maybe just using "See description" would be better. - PR Comment: https://git.openjdk.org/jdk/pull/18190#issuecomment-2026614466
Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v4]
On Wed, 27 Mar 2024 19:59:28 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one >> additional commit since the last revision: >> >> review: improve diagnostics and reliability > > test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/libPopFrameTest.cpp > line 161: > >> 159: int attempts = 0; >> 160: while (!bp_sync_reached) { >> 161: LOG("Main: ensureAtBreakpoint: waitig 5 millis\n"); > > Suggestion: > > LOG("Main: ensureAtBreakpoint: waiting 10 millis\n"); > > > I think this should be moved directly above the wait(10) call. I think you missed this suggestion. There is a typo in the log message, and this LOG should be move to just before the wait() statement below. - PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1544028878
Re: RFR: JDK-8303689: javac -Xlint could/should report on "dangling" doc comments
On Wed, 27 Mar 2024 22:04:30 GMT, Jonathan Gibbons wrote: > Please review the updates to support a proposed new > `-Xlint:dangling-doc-comments` option. > > The work can be thought of as in 3 parts: > > 1. An update to the `javac` internal class `DeferredLintHandler` so that it > is possible to specify the appropriately configured `Lint` object when it is > time to consider whether to generate the diagnostic or not. > > 2. Updates to the `javac` front end to record "dangling docs comments" found > near the beginning of a declaration, and to report them using an instance of > `DeferredLintHandler`. This allows the warnings to be enabled or disabled > using the standard mechanisms for `-Xlint` and `@SuppressWarnings`. The > procedure for handling dangling doc comments is described in this comment in > `JavacParser`. > > * Dangling documentation comments are handled as follows. > * 1. {@code Scanner} adds all doc comments to a queue of > * recent doc comments. The queue is flushed whenever > * it is known that the recent doc comments should be > * ignored and should not cause any warnings. > * 2. The primary documentation comment is the one obtained > * from the first token of any declaration. > * (using {@code token.getDocComment()}. > * 3. At the end of the "signature" of the declaration > * (that is, before any initialization or body for the > * declaration) any other "recent" comments are saved > * in a map using the primary comment as a key, > * using this method, {@code saveDanglingComments}. > * 4. When the tree node for the declaration is finally > * available, and the primary comment, if any, > * is "attached", (in {@link #attach}) any related > * dangling comments are also attached to the tree node > * by registering them using the {@link #deferredLintHandler}. > * 5. (Later) Warnings may be genereated for the dangling > * comments, subject to the {@code -Xlint} and > * {@code @SuppressWarnings}. > > > 3. Updates to the make files to disable the warnings in modules for which > the > warning is generated. This is often because of the confusing use of `/**` to > create box or other standout comments. lgtm - Marked as reviewed by vromero (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18527#pullrequestreview-1967750057
Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v5]
> This PR fixes a synchronization issue in the test: > `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest` > > The method `notifyAtBreakpoint()` can notify the `TestTask` thread when it > has not reached an expected breakpoint yet. > The fix is to add a call to the method `ensureAtBreakpoint()` one more time > in the `B2` sub-test. It is needed after the top-most frame was popped with > the JVMTI `PopFrame`, and the target thread needs to reach the breakpoint > again after its execution was resumed. > > The time is very intermittent. At least, I was not able to reproduce the > timeout failure in thousands of mach5 runs with the `-Xcomp` option. > > Testing: > - Run the test > `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest` thousands > times in mach5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: pilished native waiting timouts in test - Changes: - all: https://git.openjdk.org/jdk/pull/18419/files - new: https://git.openjdk.org/jdk/pull/18419/files/f4f92e92..14e0e60f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18419=04 - incr: https://webrevs.openjdk.org/?repo=jdk=18419=03-04 Stats: 6 lines in 2 files changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18419.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18419/head:pull/18419 PR: https://git.openjdk.org/jdk/pull/18419
Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v4]
On Wed, 27 Mar 2024 20:27:42 GMT, Daniel D. Daugherty wrote: >> test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/libPopFrameTest.cpp >> line 163: >> >>> 161: LOG("Main: ensureAtBreakpoint: waitig 5 millis\n"); >>> 162: if (++attempts > 100) { >>> 163: fatal(jni, "Main: ensureAtBreakpoint: waited 1 sec"); >> >> 1 second isn't very long when you are talking about something that is >> relying on debugger/debuggee communication. Maybe wait 100ms at a time for a >> total of 10 seconds. > > Caught this comment in passing. Delays like this should be scaled with > defaultTimeoutFactor so that test tasks that invoke tests with options > that can slow the test down, e.g., `-Xcomp`, can be accommodated. > > I believe the defaultTimeoutFact for `-Xcomp` test tasks gets bumped > from 4 to 10. Thanks for the comments, Chris and Dan. Updated as Chris suggested. I've added this with `-Xcomp` consideration as the worst case scenario in mind. Now, I think it is more save to make it 10 seconds instead of one. Is it going to be good enough? In fact, I've added this for manual testing to save time in waiting for test completion when it is deadlocked. - PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1543839536
Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v4]
On Wed, 27 Mar 2024 20:06:54 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one >> additional commit since the last revision: >> >> review: improve diagnostics and reliability > > test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/PopFrameTest.java > line 148: > >> 146: log("Main #B.2: got expected JVMTI_ERROR_NONE"); >> 147: resumeThread(testTaskThread); >> 148: > > You probably don't need this minor edit or the copyright update any more. > > However, it's still unclear to me how the ensureAtBreakpoint() below is > suppose to work if we already called notifyAtBreakpoint() between the 1st and > 2nd ensureAtBreakpoint(). From what I can tell, notifyAtBreakpoint() clears > the flag that ensureAtBreakpoint() checks for, and there is no 2nd breakpoint > to set it again. I would expect the ensureAtBreakpoint() below to always wait > indefinitely, but that doesn't appear to be the case. So how is this working? > I must be missing something. Thanks, Chris. I noticed that some an additional cleanup is needed in the Java part. The `TestTask.run()` has two calls to the method `B()`: public void run() { log("TestTask.run: started"); A(); sleep(1); // to cause yield prepareAgent(TestTask.class, false); // No doPopFrame B(); sleep(1); // to cause yield prepareAgent(TestTask.class, true); // doPopFrame B(); sleep(1); // to cause yield C(); finished = true; } The second `B()` call does the JVMTI `PopFrame` call on its own thread in the `Breakpoint()` event hadler. Please, note, `true` in the `prepareAgent` parameters: `prepareAgent(TestTask.class, true); // doPopFrame` So, the second `ensureAtBreakpoint()` is a sync point with it. > test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/libPopFrameTest.cpp > line 58: > >> 56: LOG("Breakpoint: In method TestTask.B(): after sync section\n"); >> 57: >> 58: if (do_pop_frame != JNI_FALSE) { > > Suggestion: > > if (do_pop_frame) { Okay, thanks. I can do this change. - PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1543825742 PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1543826541
Re: RFR: 8329103: assert(!thread->in_asgct()) failed during multi-mode profiling
On Thu, 28 Mar 2024 13:09:10 GMT, David Holmes wrote: >> This fix makes `AsyncGetCallTrace` reentrant and async-signal-safe. >> Reentrancy is required in the cases when two or more profiling engines are >> running at the same time, e.g., when CPU and Wall clock profilers work >> together and therefore one profiler may interrupt another in the middle of >> getting a stack trace. >> >> Tested with async-profiler: >> >> java >> -agentpath:/path/to/libasyncProfiler.so=start,event=cpu,interval=1ms,wall=1ms,file=profile.jfr > > I have my doubts as to whether AGCT is actually re-entrant in a general > sense, but I can see that the `ThreadInAsgct` RAII object introduced a > reentrancy constraint that did not exist prior, and so removing it should not > make AGCT any less safe and should allow previous reentrancy cases to > continue to work as before. @dholmes-ora Thank you for the review. When looking at the AGCT code, nothing suspicious that could affect reentrancy caught my eye. Also, benchmarks (specJVM, Renaissance) now run fine with two profilers enabled. So yes, while I don't have 100% proof that AGCT is async-signal-safe, the fix definitely improves the situation. - PR Comment: https://git.openjdk.org/jdk/pull/18504#issuecomment-2026255860
RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation
PreserveAllAnnotations option causes class file parser to preserve RuntimeInvisibleAnnotations so VM considers them as RuntimeVisibleAnnotations. For class retransformation JvmtiClassFileReconstituter restores all annotations as RuntimeVisibleAnnotations attributes. This can cause problem is the class contains only RuntimeInvisibleAnnotations, so corresponding RuntimeVisibleAnnotations attribute name is not present in the class constant pool. Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation). I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it). The suggested fix adds workaround for this corner case - if "visible" attribute name is not in the CP, the annotations are restored with "invisible" attribute name. Testing: - tier1,tier2,hs-tier5-svc - all java/lang/instrument tests; - all RedefineClasses/RetransformClasses tests - Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/18540/files Webrev: https://webrevs.openjdk.org/?repo=jdk=18540=00 Issue: https://bugs.openjdk.org/browse/JDK-8328137 Stats: 33 lines in 3 files changed: 22 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/18540.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18540/head:pull/18540 PR: https://git.openjdk.org/jdk/pull/18540
Re: RFR: 8318026: jcmd should provide access to detailed JVM object information [v9]
On Wed, 27 Mar 2024 18:39:38 GMT, Chris Plummer wrote: > I think we also need to consider the flip side of this argument. Is this > something that some customers might want to always enable, but don't want to > always have UnlockDiagnosticVMOptions enabled. A new command line flag would > be needed in that case. > > Also, isn't UnlockDiagnosticVMOptions meant for enabling the use of > diagnostic command line flags? Do we have examples of it enabling a hotspot > feature that does not also require setting a diagnostic command line flag? Thanks Chris - that is correct now I check the wording, UnlockDiagnosticVMOptions: "Enable normal processing of flags relating to field diagnostics" Yes it makes flags which are DIAGNOSTIC available at the command-line. We have UnlockExperimentalVMOptions also. My original reason for the guard, is that the risk of fooling the "good oop" check is that we could possibly crash (so that's answering your other question). Inspecting an arbitrary pointer that looks enough like a Java heap object to fool the test, means it might try and resolve bad addresses for Klass pointer or field data. I also have a stress test which examines every address in a Java heap to test for crashes. That's obviously long-running as it does a jcmd for every iteration. But I can't currently get a crash, trying G1, ZGC, ZGCGenerational. Having a global guard to prevent usage of VM.inspect may be needed less than I thought, but will wait on the other thread before saying that is really the case. We could have a new -XX flag to guard it (whether specific to this command or a more general UnlockDiagnosticFeatures. I would definitely stick to this being in all builds, as we frequently "debug" non-debug builds. Also, for your comment above about the new version of dbg_is_good_oop: this was added because although there are not many callers of it, I did not want to upset them. During my earlier testing the detailed version of this method did upset something, although I cannot reproduce that today. I will rerun some tests on these items... - PR Comment: https://git.openjdk.org/jdk/pull/17655#issuecomment-2025726278
Re: RFR: 8318026: jcmd should provide access to detailed JVM object information [v9]
On Thu, 28 Mar 2024 14:12:08 GMT, Andrew Dinn wrote: >>> It looks like the test only inspects a thread and a java object. Perhaps >>> you could add tests for additional VM objects. Maybe grab a frame PC from a >>> thread stack trace. >> >> Yes - added a couple of metadata tests, and a compiled method test, which >> gets an address from Events info. We know that should resolve to a compiled >> method (if we have such an event, so this copes if there are none). >> >> >> Also the VM.info and Thread.print runs with the CommandExecutor are now >> silent. They are long, and max out the test log file and make it truncate. >> That output could mainly be useful if the regexes can't match items, but >> the output format should be reproducible in a new run. Also if we fail to >> find something we need, like a thread id, it will print the full output it >> searched. > > @kevinjwalls Hi Kevin. Sorry to offer a drive-by comment but I have been > watching this thread and I can understand why @tstuefe is expressing his > concern about the potential for security issues when it comes to using jcmd > to expose JVM internals. > > Exposure of details like a thread/stack/metadata value address or a datum > embedded at such a location does look to me like something bad agents might > be able to profit from. The danger is not just being able to retrieve > specific details of the layout or content of JVM structures themselves, but > the opportunity to use that knowledge to upgrade a weak security hole like, > say, a memory exposure, into a stronger targeted attack that knows where or > to what it might want to apply the crowbar. > > So, first off, please understand that I am not suggesting there *is* a > problem here. I am happy to accept your conclusion that your proposed jcmd > changes do not expose new data to users who ought not to be able to view > either via local or via remote accesses. However, not being an expert when it > comes to the jcmd/DCmd implementation, I'm not sure I really understand why > that is so from your current explanation. In particular I don't understand > what you said about visibility of different commands for local and remote > access. > > That confession of my own ignorance is not really of any immediate importance > as I have no intention of trying to review this change. However, I still feel > it might be useful if you could summarize on this JIRA precisely what > security safeguards are currently in place when it comes to running specific > jcmds/DCmds and why that means exposure of JVM internal details via jcmd, > whether locally or remotely via JMX, will only be to a safe audience of > users. That would help any maintainer/developer who refers to this JIRA to > follow how those safeguards apply to the proposed commands (in particular it > might be of great help to those performing and assessing the correctness of > backports). However, it would probably also as a prophylactic to ensure that > any future development work does not inadvertently lead to unexpected > exposure, which I think is the thing Thomas is more worried about. Indeed, as > Thomas suggested, a clear statement of what policies and mechanisms are (or > should be) in plac e to ensure jcmd content is securely viewable might be a good starting point for you and/or Thomas to raise follow-up JIRAs if needed to: > 1. add comments to the code base to ensure devs to not inadvertently add or > modify new DCmds ... Hi @adinn Drive-bys are welcome. 8-) Providing new crowbars to attackers is not something we want to do. jcmd is protected by the attach api and is not open to other users. We are trusting in that. If you can satisfy the attach API connection, you have all the jcmds/DiagnosticCommands available. No limits. Likewise you can use other tools to examine abritrary memory in the target process, take a gcore, kill it, etc... Why will VM.inspect not be available remotely? Because this marks the jcmd as "hidden": src/hotspot/share/services/diagnosticCommand.cpp: DCmdFactory::register_DCmdFactory(new DCmdFactoryImpl(full_export, true, true /* hidden */ )); (The new test for VM.inspect checks it does not appear in the jcmd help output, to ensure it stays hidden.) In JMX, DiagnosticCommandMBean(Impl) does not expose hidden commands: The route for that is DiagnosticCommandImpl.java calling its native getDiagnosticCommands(), that gets into management.cpp's jmm_GetDiagnosticCommands() calling DCmdFactory::DCmd_list(). In diagnosticFramework.cpp, this iterates DCmdFactories, choosing those which are not hidden. Querying over JMX I cannot see or access vmInspect, like I can see vmInfo, gcClassHistogram etc... through the exposed DiagnosticCommandMBean. That "hidden" flag could be more obvious or better documented perhaps. Also I don't think we actually say in the DiagnosticCommandMBean API docs that it provides access to the DiagnosticCommand
Re: RFR: JDK-8327769: jcmd GC.heap_dump without options should write to location given by -XX:HeapDumpPath, if set [v10]
On Thu, 28 Mar 2024 07:02:16 GMT, Thomas Stuefe wrote: > Wouldn't this just be a case of changing a flag description? As luck has it, > the flag already has a generic name that is not tied to OOMs. Yes it is some work on documentation (but seems some doc work needs to be done anyway because it was forgotten when `HeapDumpGzipLevel` was introduced). And the current name is generic and does not mention the OOM case in the name itself, the current implementation would better match the name if it was `HeapDumpPathOnOom` or something like this. - PR Comment: https://git.openjdk.org/jdk/pull/18190#issuecomment-2025456286
Re: RFR: 8318026: jcmd should provide access to detailed JVM object information [v9]
On Wed, 27 Mar 2024 18:28:07 GMT, Kevin Walls wrote: >> It looks like the test only inspects a thread and a java object. Perhaps you >> could add tests for additional VM objects. Maybe grab a frame PC from a >> thread stack trace. > >> It looks like the test only inspects a thread and a java object. Perhaps you >> could add tests for additional VM objects. Maybe grab a frame PC from a >> thread stack trace. > > Yes - added a couple of metadata tests, and a compiled method test, which > gets an address from Events info. We know that should resolve to a compiled > method (if we have such an event, so this copes if there are none). > > > Also the VM.info and Thread.print runs with the CommandExecutor are now > silent. They are long, and max out the test log file and make it truncate. > That output could mainly be useful if the regexes can't match items, but the > output format should be reproducible in a new run. Also if we fail to find > something we need, like a thread id, it will print the full output it > searched. @kevinjwalls Hi Kevin. Sorry to offer a drive-by comment but I have been watching this thread and I can understand why @tstuefe is expressing his concern about the potential for security issues when it comes to using jcmd to expose JVM internals. Exposure of details like a thread/stack/metadata value address or a datum embedded at such a location does look to me like something bad agents might be able to profit from. The danger is not just being able to retrieve specific details of the layout or content of JVM structures themselves, but the opportunity to use that knowledge to upgrade a weak security hole like, say, a memory exposure, into a stronger targeted attack that knows where or to what it might want to apply the crowbar. So, first off, please understand that I am not suggesting there *is* a problem here. I am happy to accept your conclusion that your proposed jcmd changes do not expose new data to users who ought not to be able to view either via local or via remote accesses. However, not being an expert when it comes to the jcmd/DCmd implementation, I'm not sure I really understand why that is so from your current explanation. In particular I don't understand what you said about visibility of different commands for local and remote access. That confession of my own ignorance is not really of any immediate importance as I have no intention of trying to review this change. However, I still feel it might be useful if you could summarize on this JIRA precisely what security safeguards are currently in place when it comes to running specific jcmds/DCmds and why that means exposure of JVM internal details via jcmd, whether locally or remotely via JMX, will only be to a safe audience of users. That would help any maintainer/developer who refers to this JIRA to follow how those safeguards apply to the proposed commands (in particular it might be of great help to those performing and assessing the correctness of backports). However, it would probably also as a prophylactic to ensure that any future development work does not inadvertently lead to unexpected exposure, which I think is the thing Thomas is more worried about. Indeed, as Thomas suggested, a clear statement of what policies and mechanisms are (or should be) in place to ensure jcmd content is securely viewable might be a good starting point for you and/or Thomas to raise follow-up JIRAs if needed to: 1. add comments to the code base to ensure devs to not inadvertently add or modify new DCmds in a way that prejudices security 2. enable a review of the existing DCmd organization to see if a better grouping of commands or assignent of access restrictions might be warranted - PR Comment: https://git.openjdk.org/jdk/pull/17655#issuecomment-2025282602
Re: RFR: 8329103: assert(!thread->in_asgct()) failed during multi-mode profiling
On Wed, 27 Mar 2024 01:02:41 GMT, Andrei Pangin wrote: > This fix makes `AsyncGetCallTrace` reentrant and async-signal-safe. > Reentrancy is required in the cases when two or more profiling engines are > running at the same time, e.g., when CPU and Wall clock profilers work > together and therefore one profiler may interrupt another in the middle of > getting a stack trace. > > Tested with async-profiler: > > java > -agentpath:/path/to/libasyncProfiler.so=start,event=cpu,interval=1ms,wall=1ms,file=profile.jfr I have my doubts as to whether AGCT is actually re-entrant in a general sense, but I can see that the `ThreadInAsgct` RAII object introduced a reentrancy constraint that did not exist prior, and so removing it should not make AGCT any less safe and should allow previous reentrancy cases to continue to work as before. - Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18504#pullrequestreview-1966173558
Integrated: 8328619: sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java failed with BindException: Address already in use
On Mon, 25 Mar 2024 13:15:48 GMT, Kevin Walls wrote: > Test uses jdk.test.lib.Utils.getFreePort() when launching a new Java command. > Looks like it already recognises "java.rmi.server.ExportException: Port > already in use: " and retries, but there is a long-standing typo in the check. > > e.g. > > test output: > Error: Exception thrown by the agent: java.rmi.server.ExportException: Port > already in use: 37049; nested exception is: > java.net.BindException: Address already in use > > Test checks for: > !output.getOutput().contains("Exception thrown by the agent : > java.rmi.server.ExportException: Port already in use") > > Oops, we have an extra space in there. A day-one typo from JDK-7195249. > > While here, try to clarify the while loop which recognises this port failure. > Also add something to clarify which test(s) failed, and correct a comment in > test2 of AbstractFilePermissionTest.java This pull request has now been integrated. Changeset: 2af0312c Author:Kevin Walls URL: https://git.openjdk.org/jdk/commit/2af0312c958e693b1377f4c014ae8f84cabf6b83 Stats: 19 lines in 1 file changed: 7 ins; 2 del; 10 mod 8328619: sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java failed with BindException: Address already in use Reviewed-by: lmesnik, dfuchs - PR: https://git.openjdk.org/jdk/pull/18470
Re: RFR: 8328619: sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java failed with BindException: Address already in use [v2]
On Mon, 25 Mar 2024 22:46:47 GMT, Kevin Walls wrote: >> Test uses jdk.test.lib.Utils.getFreePort() when launching a new Java >> command. >> Looks like it already recognises "java.rmi.server.ExportException: Port >> already in use: " and retries, but there is a long-standing typo in the >> check. >> >> e.g. >> >> test output: >> Error: Exception thrown by the agent: java.rmi.server.ExportException: Port >> already in use: 37049; nested exception is: >> java.net.BindException: Address already in use >> >> Test checks for: >> !output.getOutput().contains("Exception thrown by the agent : >> java.rmi.server.ExportException: Port already in use") >> >> Oops, we have an extra space in there. A day-one typo from JDK-7195249. >> >> While here, try to clarify the while loop which recognises this port >> failure. Also add something to clarify which test(s) failed, and correct a >> comment in test2 of AbstractFilePermissionTest.java > > Kevin Walls has updated the pull request incrementally with one additional > commit since the last revision: > > show exit code Thanks Daniel. Yes, the variations of the starting-with-a-free-port problem will be nice to standardize at some point... 8-) - PR Comment: https://git.openjdk.org/jdk/pull/18470#issuecomment-2024969513
Re: RFR: 8328619: sun/management/jmxremote/bootstrap/SSLConfigFilePermissionTest.java failed with BindException: Address already in use [v2]
On Mon, 25 Mar 2024 22:46:47 GMT, Kevin Walls wrote: >> Test uses jdk.test.lib.Utils.getFreePort() when launching a new Java >> command. >> Looks like it already recognises "java.rmi.server.ExportException: Port >> already in use: " and retries, but there is a long-standing typo in the >> check. >> >> e.g. >> >> test output: >> Error: Exception thrown by the agent: java.rmi.server.ExportException: Port >> already in use: 37049; nested exception is: >> java.net.BindException: Address already in use >> >> Test checks for: >> !output.getOutput().contains("Exception thrown by the agent : >> java.rmi.server.ExportException: Port already in use") >> >> Oops, we have an extra space in there. A day-one typo from JDK-7195249. >> >> While here, try to clarify the while loop which recognises this port >> failure. Also add something to clarify which test(s) failed, and correct a >> comment in test2 of AbstractFilePermissionTest.java > > Kevin Walls has updated the pull request incrementally with one additional > commit since the last revision: > > show exit code Looks reasonable. I wish we had a way to get rid of this getFreePort() logic. Marked as reviewed by dfuchs (Reviewer). - PR Review: https://git.openjdk.org/jdk/pull/18470#pullrequestreview-1965847115 PR Review: https://git.openjdk.org/jdk/pull/18470#pullrequestreview-1965848443
Integrated: 8327505: Test com/sun/jmx/remote/NotificationMarshalVersions/TestSerializationMismatch.java fails
On Tue, 19 Mar 2024 16:23:27 GMT, Kevin Walls wrote: > Client.java has a fixed 30-second timeout on the CountDownLatch to wait for > 10 notifications. > > If it fails, you can't tell if CountDownLatch.await threw, or returned false > and the app threw InterruptedException, due to the way Client.java handles > these. > > Seems most likely the 30 second wait expired, as we are dealing with -Xcomp > failures in a debug build. Passing runs can take a few seconds, but can be > 25 seconds. > > Increasing the timeout and tidying up the handling so we can see the specific > reason in future. This pull request has now been integrated. Changeset: 2b79c22c Author:Kevin Walls URL: https://git.openjdk.org/jdk/commit/2b79c22c43a2de0815e77c9aa71f010906be8670 Stats: 7 lines in 1 file changed: 3 ins; 0 del; 4 mod 8327505: Test com/sun/jmx/remote/NotificationMarshalVersions/TestSerializationMismatch.java fails Reviewed-by: lmesnik, stuefe - PR: https://git.openjdk.org/jdk/pull/18381
Re: RFR: 8327505: Test com/sun/jmx/remote/NotificationMarshalVersions/TestSerializationMismatch.java fails
On Tue, 19 Mar 2024 16:23:27 GMT, Kevin Walls wrote: > Client.java has a fixed 30-second timeout on the CountDownLatch to wait for > 10 notifications. > > If it fails, you can't tell if CountDownLatch.await threw, or returned false > and the app threw InterruptedException, due to the way Client.java handles > these. > > Seems most likely the 30 second wait expired, as we are dealing with -Xcomp > failures in a debug build. Passing runs can take a few seconds, but can be > 25 seconds. > > Increasing the timeout and tidying up the handling so we can see the specific > reason in future. Thanks Thomas! - PR Comment: https://git.openjdk.org/jdk/pull/18381#issuecomment-2024715786
Re: RFR: JDK-8327769: jcmd GC.heap_dump without options should write to location given by -XX:HeapDumpPath, if set [v10]
On Wed, 27 Mar 2024 23:33:18 GMT, Chris Plummer wrote: > I did get feedback from a couple of our support engineers. One felt it was of > no real benefit as it was just as easy to provide the > filename as an > argument to the jcmd. That might be true for this support engineer, but not for others with other setups where it is an issue to find out where to write in cloud/container scenarios . - PR Comment: https://git.openjdk.org/jdk/pull/18190#issuecomment-2024704505
Re: RFR: JDK-8327769: jcmd GC.heap_dump without options should write to location given by -XX:HeapDumpPath, if set [v9]
On Wed, 27 Mar 2024 23:26:14 GMT, Chris Plummer wrote: > > Backporting also came to mind since the trouble shooting guide would need to > be updated for each Oracle version this PR is backported to. And yes, > HeapDumpGzipLevel docs need to match the actual implementation. I'm not sure > if HeapDumpGzipLevel has been backported to 8, 11, and 17 (in the open for > for the Oracle release). I'm inclined to say a separate jbs issue should > track updating docs for HeapDumpGzipLevel, but then we also have the question > of whether or not HeapDumpGzipLevel should impact the jcmd if this PR is to > be pushed. HeapDumpGzipLevel is in openjdk17 and openjdk21 . - PR Comment: https://git.openjdk.org/jdk/pull/18190#issuecomment-2024698315
Re: RFR: 8327505: Test com/sun/jmx/remote/NotificationMarshalVersions/TestSerializationMismatch.java fails
On Tue, 19 Mar 2024 16:23:27 GMT, Kevin Walls wrote: > Client.java has a fixed 30-second timeout on the CountDownLatch to wait for > 10 notifications. > > If it fails, you can't tell if CountDownLatch.await threw, or returned false > and the app threw InterruptedException, due to the way Client.java handles > these. > > Seems most likely the 30 second wait expired, as we are dealing with -Xcomp > failures in a debug build. Passing runs can take a few seconds, but can be > 25 seconds. > > Increasing the timeout and tidying up the handling so we can see the specific > reason in future. Looks good! - Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18381#pullrequestreview-1965321788
Re: RFR: JDK-8327769: jcmd GC.heap_dump without options should write to location given by -XX:HeapDumpPath, if set [v10]
On Wed, 27 Mar 2024 13:44:42 GMT, Matthias Baesken wrote: >> Currently jcmd command GC.heap_dump only works with an additionally provided >> file name. >> Syntax : GC.heap_dump [options] >> >> In case the JVM has the XX - flag HeapDumpPath set, we should support an >> additional mode where the is optional. >> In case the filename is NOT set, we take the HeapDumpPath (file or >> directory); >> >> new syntax : >> GC.heap_dump [options] .. has precedence over second option >> GC.heap_dump [options] …in case -XX: HeapDumpPath=p is set >> >> This would be a simplification e.g. for support cases where a filename or >> directory is set at JVM startup with -XX: HeapDumpPath=p and writing to the >> path is intended/recommended for usage also in the jcmd case. > > Matthias Baesken has updated the pull request incrementally with one > additional commit since the last revision: > > adjust java.1 man page To weight in on the side of this patch, in most customer scenarios I have seen, there is just one location for heap dumps. Using the same location for OOM and for manual heap dumps seems logical to me. @dholmes-ora > Notwithstanding that there may be cases where this change would be > convenient, I really don't like this coupling between the jcmd behaviour and > a -XX flag that is intended for something else. It doesn't completely mesh > with the jcmd usage and other options, and the documentation story is quite > complicated. Wouldn't this just be a case of changing a flag description? As luck has it, the flag already has a generic name that is not tied to OOMs. - PR Comment: https://git.openjdk.org/jdk/pull/18190#issuecomment-2024535044