RFR (XS): 8196450: Deprecate JDWP/JDI canUnrestrictedlyRedefineClasses to match JVM TI capabilities
Please, review a fix for: https://bugs.openjdk.java.net/browse/JDK-8196450 CSR draft (one CSR reviewer is needed before finalizing it): https://bugs.openjdk.java.net/browse/JDK-8246540 Webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jdwp-depr.1/src/ Updated JDWP VirtualMachine::capabilitiesNew spec: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jdwp-depr.1/docs/specs/jdwp/jdwp-protocol.html#JDWP_VirtualMachine_CapabilitiesNew Updated JDI com.sun.jdi.VirtualMachine spec: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jdwp-depr.1/docs/api/jdk.jdi/com/sun/jdi/VirtualMachine.html#canAddMethod() http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jdwp-depr.1/docs/api/jdk.jdi/com/sun/jdi/VirtualMachine.html#canUnrestrictedlyRedefineClasses() Summary: The fix adds annotations and deprecation comments to the capabilities canUnrestrictedlyRedefineClasses and canAddMethod. It impacts the JDWP capabilitiesNew command and the JDI VirtualMachine interface. Testing: Built docs and checked the doc has been generated as expected. Will run the JDI/JDWP tests locally Thanks, Serguei
Re: RFR: 8131745: java/lang/management/ThreadMXBean/AllThreadIds.java still fails intermittently
Hi Alex, Please review a new version of the webrev [1] that no longer uses waitTillEquals() method. [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.04/ [2] https://bugs.openjdk.java.net/browse/JDK-8131745 Thank you, Daniil On 6/3/20, 4:42 PM, "Alex Menkov" wrote: Hi again Daniil, On 06/03/2020 16:31, Daniil Titov wrote: > Hi Alex, > > Thanks for this suggestion. You are right, we actually don't need this waitForAllThreads() method. > > I will include this change in the new version of the webrev. > >> 207 int diff = numNewThreads - numTerminatedThreads; >> 208 long threadCount = mbean.getThreadCount(); >> 209 long expectedThreadCount = prevLiveTestThreadCount + diff; >> 210 if (threadCount < expectedThreadCount) { >> if some internal thread terminates, we'll get failure here > > The failure will not happen. Please note that prevLiveTestThreadCount counts only *test* threads. Thus even if some Internal threads terminated the value mbean.getThreadCount() returns should still be no less than the expected number of live test threads. > > 310 prevLiveTestThreadCount = getTestThreadCount(); Oh, yes, I missed it. LGTM. --alex > > Best regards, > Daniil > > > On 6/3/20, 3:08 PM, "Alex Menkov" wrote: > > Hi Daniil, > > couple notes: > > 198 waitForThreads(numNewThreads, numTerminatedThreads); > > You don't actually need any wait here. > Test cases wait until all threads are in desired state > (checkAllThreadsAlive uses startupCheck, checkDaemonThreadsDead and > checkAllThreadsDead use join()) > > >205 private static void checkLiveThreads(int numNewThreads, >206 int numTerminatedThreads) { >207 int diff = numNewThreads - numTerminatedThreads; >208 long threadCount = mbean.getThreadCount(); >209 long expectedThreadCount = prevLiveTestThreadCount + diff; >210 if (threadCount < expectedThreadCount) { > > if some internal thread terminates, we'll get failure here > > > --alex > > On 06/02/2020 21:00, Daniil Titov wrote: > > Hi Alex, Serguei, and Martin, > > > > Thank you for your comments. Please review a new version of the fix that addresses them, specifically: > > 1) Replaces a double loop in checkAllThreadsAlive() with a code that uses collections and containsAll() method. > > 2) Restores the checks for other ThreadMXBean methods (getThreadCount(), getTotalStartedThreadCount(), getPeakThreadCount()) but with more relaxed conditions. > > 3) Relaxes the check inside checkThreadIds() method > > > > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.03/ > > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > > > Thank you, > > Daniil > > > > On 6/1/20, 5:06 PM, "Alex Menkov" wrote: > > > > Hi Daniil, > > > > 1. before the fix checkLiveThreads() tested > > ThreadMXBean.getThreadCount(), but now as far as I see it tests > > Thread.getAllStackTraces(); > > > > 2. > >237 private static void checkThreadIds() throws InterruptedException { > >238 long[] list = mbean.getAllThreadIds(); > >239 > >240 waitTillEquals( > >241 list.length, > >242 ()->(long)mbean.getThreadCount(), > >243 "Array length returned by " + > >244 "getAllThreadIds() = %1$d not matched count = > > ${provided}", > >245 ()->list.length > >246 ); > >247 } > > > > I suppose purpose of waitTillEquals() is to handle creation/termination > > of VM internal threads. > > But if some internal thread terminates after mbean.getAllThreadIds() and > > before 1st mbean.getThreadCount() call and then VM does not need to > > restart it, waitTillEquals will wait forever. > > > > --alex > > > > > > On 05/29/2020 16:28, Daniil Titov wrote: > > > Hi Alex and Serguei, > > > > > > Please review a new version of the change [1] that makes sure that the test counts > > > only the threads it creates and ignores Internal threads VM might create or des
Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
Hi Richard, The mach5 test run is good. Thanks, Serguei On 6/2/20 10:57, Reingruber, Richard wrote: Hi Serguei, This looks good to me. Thanks! From an earlier mail: I'm thinking it would be more safe to run full tier5. I guess we're done with reviewing. Would be good if you could run full tier5 now. After that I would like to push. Thanks, Richard. -Original Message- From: serguei.spit...@oracle.com Sent: Dienstag, 2. Juni 2020 18:55 To: Vladimir Kozlov ; Reingruber, Richard ; serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net; hotspot-runtime-...@openjdk.java.net; hotspot-gc-...@openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Hi Richard, This looks good to me. Thanks, Serguei On 5/28/20 09:02, Vladimir Kozlov wrote: Vladimir Ivanov is on break currently. It looks good to me. Thanks, Vladimir K On 5/26/20 7:31 AM, Reingruber, Richard wrote: Hi Vladimir, Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ Not an expert in JVMTI code base, so can't comment on the actual changes. From JIT-compilers perspective it looks good. I put out webrev.1 a while ago [1]: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1/ Webrev(delta): http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.1.inc/ You originally suggested to use a handshake to switch a thread into interpreter mode [2]. I'm using a direct handshake now, because I think it is the best fit. May I ask if webrev.1 still looks good to you from JIT-compilers perspective? Can I list you as (partial) Reviewer? Thanks, Richard. [1] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-April/031245.html [2] http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030340.html -Original Message- From: Vladimir Ivanov Sent: Freitag, 7. Februar 2020 09:19 To: Reingruber, Richard ; serviceability-dev@openjdk.java.net; hotspot-compiler-...@openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ Not an expert in JVMTI code base, so can't comment on the actual changes. From JIT-compilers perspective it looks good. Best regards, Vladimir Ivanov Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 The change avoids making all compiled methods on stack not_entrant when switching a java thread to interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. Thanks, Richard. See also my question if anyone knows a reason for making the compiled methods not_entrant: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html
Re: RFR(S) : 8246494 : introduce vm.flagless at-requires property
Hi David, > So all such tests should be using driver mode, and further the VMs they then > exec don't use any of the APIs that include the jtreg test arguments. correct, and 8151707's subtasks are going to mark only such tests (and tests which should be using driver-mode, but can't due to external factors, remember these follow-up fixes for my use driver-mode? ;) ). there are two more (a bit controversial) use cases where we can consider usage of vm.flagless: - some of debugger-debuggee tests have debugger executed w/ external flags, but don't pass these flags to debuggee; and in most cases, it doesn't seem to be right, so arguable all such tests should be updated to use driver mode to run debugger and then marked w/ vm.flagless. I know that svc team was doing some cleanup in this area recently, and given it's require more investigation w.r.t the tests' intent, I don't plan to do it as a part of 8151707, and instead will create follow up RFEs/tasks. - a unit-like tests which don't ignore flags, but weren't designed to be run w/ external flags; most of jfr tests can be used as an example: you can run w/ any flags, but they might fail as they assert things which happen only in certain configurations and these configurations are written in jtreg test descriptions. currently, these tests are marked w/ jfr k/w and it's advised not to run them w/ any external flags, yet I know that some people successfully do that to test their configurations. given the set of configurations which satisfies needs of jfr tests is much bigger than the configurations listed in the tests, I kinda feel sympathetic to people doing that, on the other hand, it's unsupported and I'd prefer us to express (and enforce) that more clearly. again, given the possible controversiality and need for a broader discussion, I'm planning to file an issue for jfr tests and follow up later w/ interested parties. to sum up, 8151707's subtasks are going to mark *only* obvious and non-controversial cases. for all other cases, the JBS entries are to be filed and followed up on. Cheers, -- Igor > On Jun 3, 2020, at 4:02 PM, David Holmes wrote: > > Hi Igor, > > On 4/06/2020 7:30 am, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >>> 70 lines changed: 66 ins; 0 del; 4 mod >> Hi all, >> could you please review the patch which introduces a new @requires property >> to filter out the tests which ignore externally provided JVM flags? >> the idea behind this patch is to have a way to clearly mark tests which >> ignore flags, so >> a) it's obvious that they don't execute a flag-guarded code/feature, and >> extra care should be taken to use them to verify any flag-guarded changed; >> b) they can be easily excluded from runs w/ flags. > > So all such tests should be using driver mode, and further the VMs they then > exec don't use any of the APIs that include the jtreg test arguments. > > Okay this seems reasonable in what it does. > > Thanks, > David > >> @requires and VMProps allows us to achieve both, so it's been decided to add >> a new property `vm.flagless`. `vm.flagless` is set to false if there are any >> XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` >> (which are known to be set almost always) or any X flags other `-Xmixed`; in >> other words any tests w/ `@requires vm.flagless` will be excluded from runs >> w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare >> cases, when one still wants to run the tests marked by `vm.flagless` w/ >> external flags, `vm.flagless` can be forcefully set to true by setting any >> value to `TEST_VM_FLAGLESS` env. variable. >> this patch adds necessary common changes and marks common tests, namely >> Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests >> will be marked separately by the corresponding subtasks of 8151707[1]. >> please note, the patch depends on CODETOOLS-7902336[2], which will be >> included in the next jtreg version, so this patch is to be integrated only >> after jtreg5.1 is promoted and we switch to use it by 8246387[3]. >> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 >> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >> testing: marked tests w/ different XX and X flags w/ and w/o >> TEST_VM_FLAGLESS env. var, and w/o any flags >> [1] https://bugs.openjdk.java.net/browse/JDK-8151707 >> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 >> [3] https://bugs.openjdk.java.net/browse/JDK-8246387 >> Thanks, >> -- Igor
Re: RFR(XS): 8234882: JVM TI StopThread should only allow ThreadDeath
On 6/3/20 16:10, David Holmes wrote: Hi Serguei, On 4/06/2020 6:41 am, serguei.spit...@oracle.com wrote: Hi David, The JetBrains confirmed: Ability to select the exception is a valuable feature they provide. Throwing only ThreadDeath is almost useless. So, should I close this and related JDI/JDWP enhancements as WNF? Yes. Sorry about the wasted work here. No problem, David. Thanks! Serguei Thanks, David - Thanks, Serguei On 6/1/20 08:30, serguei.spit...@oracle.com wrote: Hi David, I'll check with JetBrains on this. Thank you to Dan and you for raising this concern. The JetBrains use case you posted in the CSR looks like valid and useful. Thanks, Serguei On 6/1/20 00:46, David Holmes wrote: Hi Serguei, Sorry, I think we have to re-think this change. As Dan flags in the CSR request debuggers directly expose this API as part of the debugger interface, so any change here will directly impact those tools. At a minimum I think we would need to consult with the tool developers about the impact of making this change, as well as whether it makes any practical difference in the sense that there may be other (less convenient but still available) mechanisms to achieve the same goal in a debugger or agent. David On 31/05/2020 5:50 pm, serguei.spit...@oracle.com wrote: Hi David, Also jumping to end. On 5/30/20 06:50, David Holmes wrote: Hi Serguei, Jumping to the end for now ... On 30/05/2020 5:50 am, serguei.spit...@oracle.com wrote: Hi David and reviewers, The updated webrev version is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.2/src/ This update adds testing that StopThread can return JVMTI_ERROR_INVALID_OBJECT error code. The JVM TI StopThread spec is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.2/docs/specs/jvmti.html#StopThread There is a couple of comments below. On 5/29/20 06:18, David Holmes wrote: On 29/05/2020 6:24 pm, serguei.spit...@oracle.com wrote: On 5/29/20 00:56, serguei.spit...@oracle.com wrote: On 5/29/20 00:42, serguei.spit...@oracle.com wrote: Hi David, Thank you for reviewing this! On 5/28/20 23:57, David Holmes wrote: Hi Serguei, On 28/05/2020 3:12 pm, serguei.spit...@oracle.com wrote: Hi David, I've updated the CSR and webrev in place. The changes are: - addressed David's suggestion to rephrase StopThread description change - replaced JVMTI_ERROR_INVALID_OBJECT with JVMTI_ERROR_ILLEGAL_ARGUMENT - updated the implementation in jvmtiEnv.cpp to return JVMTI_ERROR_ILLEGAL_ARGUMENT - updated one of the nsk.jvmti StopThread tests to check error case with the JVMTI_ERROR_ILLEGAL_ARGUMENT I'm reposting the links for convenience. Enhancement: https://bugs.openjdk.java.net/browse/JDK-8234882 CSR draft: https://bugs.openjdk.java.net/browse/JDK-8245853 Spec updates are good - thanks. Thank you for the CSR review. Webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/src/ src/hotspot/share/prims/jvmtiEnv.cpp The ThreadDeath check is fine but I'm a bit confused about the additional null check that leads to JVMTI_ERROR_INVALID_OBJECT. I can't see how resolve_external_guard can return NULL when not passed in NULL. Nor why that would result in JVMTI_ERROR_INVALID_OBJECT rather than JVMTI_ERROR_NULL_POINTER. And I note JVMTI_ERROR_NULL_POINTER is not even a listed error for StopThread! This part of the change seems unrelated to this issue. I was also surprised with the JVMTI_ERROR_NULL_POINTER and JVMTI_ERROR_INVALID_OBJECT error codes. The JVM TI spec automatic generation adds these two error codes for a jobject parameter. Also, they both are from the Universal Errors section: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/docs/specs/jvmti.html#universal-error You can find a link to this section at the start of the Error section: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/docs/specs/jvmti.html#StopThread My understanding (not sure, it is right) is that NULL has to be reported with JVMTI_ERROR_NULL_POINTER and a bad jobject (for instance, a WeakReference with a GC-ed target) has to be reported with JVMTI_ERROR_INVALID_OBJECT. At least, I was not able to construct a test case to get this error code returned. So, I'm puzzled with this. I'll try to find some examples with JVMTI_ERROR_NULL_POINTER errors. Found the explanation. The JDI file: src/jdk.jdi/share/classes/com/sun/tools/jdi/JDWPException.java has a fragment that translate the INVALID_OBJECT error to the ObjectCollectedException: RuntimeException toJDIException() { switch (errorCode) { case JDWP.Error.INVALID_OBJECT: return new ObjectCollectedException(); So, the INVALID_OBJECT is for a jobject handle that is referencing a collected object. It means that previous implementation incorrectly returned JVMTI_ERROR_NULL_POINTER error code. I should create
Re: RFR: 8131745: java/lang/management/ThreadMXBean/AllThreadIds.java still fails intermittently
Hi again Daniil, On 06/03/2020 16:31, Daniil Titov wrote: Hi Alex, Thanks for this suggestion. You are right, we actually don't need this waitForAllThreads() method. I will include this change in the new version of the webrev. 207 int diff = numNewThreads - numTerminatedThreads; 208 long threadCount = mbean.getThreadCount(); 209 long expectedThreadCount = prevLiveTestThreadCount + diff; 210 if (threadCount < expectedThreadCount) { if some internal thread terminates, we'll get failure here The failure will not happen. Please note that prevLiveTestThreadCount counts only *test* threads. Thus even if some Internal threads terminated the value mbean.getThreadCount() returns should still be no less than the expected number of live test threads. 310 prevLiveTestThreadCount = getTestThreadCount(); Oh, yes, I missed it. LGTM. --alex Best regards, Daniil On 6/3/20, 3:08 PM, "Alex Menkov" wrote: Hi Daniil, couple notes: 198 waitForThreads(numNewThreads, numTerminatedThreads); You don't actually need any wait here. Test cases wait until all threads are in desired state (checkAllThreadsAlive uses startupCheck, checkDaemonThreadsDead and checkAllThreadsDead use join()) 205 private static void checkLiveThreads(int numNewThreads, 206 int numTerminatedThreads) { 207 int diff = numNewThreads - numTerminatedThreads; 208 long threadCount = mbean.getThreadCount(); 209 long expectedThreadCount = prevLiveTestThreadCount + diff; 210 if (threadCount < expectedThreadCount) { if some internal thread terminates, we'll get failure here --alex On 06/02/2020 21:00, Daniil Titov wrote: > Hi Alex, Serguei, and Martin, > > Thank you for your comments. Please review a new version of the fix that addresses them, specifically: > 1) Replaces a double loop in checkAllThreadsAlive() with a code that uses collections and containsAll() method. > 2) Restores the checks for other ThreadMXBean methods (getThreadCount(), getTotalStartedThreadCount(), getPeakThreadCount()) but with more relaxed conditions. > 3) Relaxes the check inside checkThreadIds() method > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.03/ > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > Thank you, > Daniil > > On 6/1/20, 5:06 PM, "Alex Menkov" wrote: > > Hi Daniil, > > 1. before the fix checkLiveThreads() tested > ThreadMXBean.getThreadCount(), but now as far as I see it tests > Thread.getAllStackTraces(); > > 2. >237 private static void checkThreadIds() throws InterruptedException { >238 long[] list = mbean.getAllThreadIds(); >239 >240 waitTillEquals( >241 list.length, >242 ()->(long)mbean.getThreadCount(), >243 "Array length returned by " + >244 "getAllThreadIds() = %1$d not matched count = > ${provided}", >245 ()->list.length >246 ); >247 } > > I suppose purpose of waitTillEquals() is to handle creation/termination > of VM internal threads. > But if some internal thread terminates after mbean.getAllThreadIds() and > before 1st mbean.getThreadCount() call and then VM does not need to > restart it, waitTillEquals will wait forever. > > --alex > > > On 05/29/2020 16:28, Daniil Titov wrote: > > Hi Alex and Serguei, > > > > Please review a new version of the change [1] that makes sure that the test counts > > only the threads it creates and ignores Internal threads VM might create or destroy. > > > > Testing: Running this test in Mach5 with Graal on several hundred times , > > tier1-tier3 tests are in progress. > > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.02/ > > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > > > Thank you, > > Daniil > > > > On 5/22/20, 10:26 AM, "Alex Menkov" wrote: > > > > Hi Daniil, > > > > I'm not sure all this retry logic is a good way. > > As mentioned in jira the most important part of the testing is ensuring > > that you find all the created threads when they are alive, and you don't > > find them when they are dead. The actual thread count checking is not > > that important.
Re: RFR: 8131745: java/lang/management/ThreadMXBean/AllThreadIds.java still fails intermittently
Hi Alex, Thanks for this suggestion. You are right, we actually don't need this waitForAllThreads() method. I will include this change in the new version of the webrev. > 207 int diff = numNewThreads - numTerminatedThreads; > 208 long threadCount = mbean.getThreadCount(); > 209 long expectedThreadCount = prevLiveTestThreadCount + diff; > 210 if (threadCount < expectedThreadCount) { >if some internal thread terminates, we'll get failure here The failure will not happen. Please note that prevLiveTestThreadCount counts only *test* threads. Thus even if some Internal threads terminated the value mbean.getThreadCount() returns should still be no less than the expected number of live test threads. 310 prevLiveTestThreadCount = getTestThreadCount(); Best regards, Daniil On 6/3/20, 3:08 PM, "Alex Menkov" wrote: Hi Daniil, couple notes: 198 waitForThreads(numNewThreads, numTerminatedThreads); You don't actually need any wait here. Test cases wait until all threads are in desired state (checkAllThreadsAlive uses startupCheck, checkDaemonThreadsDead and checkAllThreadsDead use join()) 205 private static void checkLiveThreads(int numNewThreads, 206 int numTerminatedThreads) { 207 int diff = numNewThreads - numTerminatedThreads; 208 long threadCount = mbean.getThreadCount(); 209 long expectedThreadCount = prevLiveTestThreadCount + diff; 210 if (threadCount < expectedThreadCount) { if some internal thread terminates, we'll get failure here --alex On 06/02/2020 21:00, Daniil Titov wrote: > Hi Alex, Serguei, and Martin, > > Thank you for your comments. Please review a new version of the fix that addresses them, specifically: > 1) Replaces a double loop in checkAllThreadsAlive() with a code that uses collections and containsAll() method. > 2) Restores the checks for other ThreadMXBean methods (getThreadCount(), getTotalStartedThreadCount(), getPeakThreadCount()) but with more relaxed conditions. > 3) Relaxes the check inside checkThreadIds() method > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.03/ > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > Thank you, > Daniil > > On 6/1/20, 5:06 PM, "Alex Menkov" wrote: > > Hi Daniil, > > 1. before the fix checkLiveThreads() tested > ThreadMXBean.getThreadCount(), but now as far as I see it tests > Thread.getAllStackTraces(); > > 2. >237 private static void checkThreadIds() throws InterruptedException { >238 long[] list = mbean.getAllThreadIds(); >239 >240 waitTillEquals( >241 list.length, >242 ()->(long)mbean.getThreadCount(), >243 "Array length returned by " + >244 "getAllThreadIds() = %1$d not matched count = > ${provided}", >245 ()->list.length >246 ); >247 } > > I suppose purpose of waitTillEquals() is to handle creation/termination > of VM internal threads. > But if some internal thread terminates after mbean.getAllThreadIds() and > before 1st mbean.getThreadCount() call and then VM does not need to > restart it, waitTillEquals will wait forever. > > --alex > > > On 05/29/2020 16:28, Daniil Titov wrote: > > Hi Alex and Serguei, > > > > Please review a new version of the change [1] that makes sure that the test counts > > only the threads it creates and ignores Internal threads VM might create or destroy. > > > > Testing: Running this test in Mach5 with Graal on several hundred times , > > tier1-tier3 tests are in progress. > > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.02/ > > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > > > Thank you, > > Daniil > > > > On 5/22/20, 10:26 AM, "Alex Menkov" wrote: > > > > Hi Daniil, > > > > I'm not sure all this retry logic is a good way. > > As mentioned in jira the most important part of the testing is ensuring > > that you find all the created threads when they are alive, and you don't > > find them when they are dead. The actual thread count checking is not > > that important. > > I agree with this and I'd just simplify the test by removing checks for > > thread count. VM may create and destroy internal threads when
Re: RFR(XS): 8234882: JVM TI StopThread should only allow ThreadDeath
Hi Serguei, On 4/06/2020 6:41 am, serguei.spit...@oracle.com wrote: Hi David, The JetBrains confirmed: Ability to select the exception is a valuable feature they provide. Throwing only ThreadDeath is almost useless. So, should I close this and related JDI/JDWP enhancements as WNF? Yes. Sorry about the wasted work here. Thanks, David - Thanks, Serguei On 6/1/20 08:30, serguei.spit...@oracle.com wrote: Hi David, I'll check with JetBrains on this. Thank you to Dan and you for raising this concern. The JetBrains use case you posted in the CSR looks like valid and useful. Thanks, Serguei On 6/1/20 00:46, David Holmes wrote: Hi Serguei, Sorry, I think we have to re-think this change. As Dan flags in the CSR request debuggers directly expose this API as part of the debugger interface, so any change here will directly impact those tools. At a minimum I think we would need to consult with the tool developers about the impact of making this change, as well as whether it makes any practical difference in the sense that there may be other (less convenient but still available) mechanisms to achieve the same goal in a debugger or agent. David On 31/05/2020 5:50 pm, serguei.spit...@oracle.com wrote: Hi David, Also jumping to end. On 5/30/20 06:50, David Holmes wrote: Hi Serguei, Jumping to the end for now ... On 30/05/2020 5:50 am, serguei.spit...@oracle.com wrote: Hi David and reviewers, The updated webrev version is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.2/src/ This update adds testing that StopThread can return JVMTI_ERROR_INVALID_OBJECT error code. The JVM TI StopThread spec is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.2/docs/specs/jvmti.html#StopThread There is a couple of comments below. On 5/29/20 06:18, David Holmes wrote: On 29/05/2020 6:24 pm, serguei.spit...@oracle.com wrote: On 5/29/20 00:56, serguei.spit...@oracle.com wrote: On 5/29/20 00:42, serguei.spit...@oracle.com wrote: Hi David, Thank you for reviewing this! On 5/28/20 23:57, David Holmes wrote: Hi Serguei, On 28/05/2020 3:12 pm, serguei.spit...@oracle.com wrote: Hi David, I've updated the CSR and webrev in place. The changes are: - addressed David's suggestion to rephrase StopThread description change - replaced JVMTI_ERROR_INVALID_OBJECT with JVMTI_ERROR_ILLEGAL_ARGUMENT - updated the implementation in jvmtiEnv.cpp to return JVMTI_ERROR_ILLEGAL_ARGUMENT - updated one of the nsk.jvmti StopThread tests to check error case with the JVMTI_ERROR_ILLEGAL_ARGUMENT I'm reposting the links for convenience. Enhancement: https://bugs.openjdk.java.net/browse/JDK-8234882 CSR draft: https://bugs.openjdk.java.net/browse/JDK-8245853 Spec updates are good - thanks. Thank you for the CSR review. Webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/src/ src/hotspot/share/prims/jvmtiEnv.cpp The ThreadDeath check is fine but I'm a bit confused about the additional null check that leads to JVMTI_ERROR_INVALID_OBJECT. I can't see how resolve_external_guard can return NULL when not passed in NULL. Nor why that would result in JVMTI_ERROR_INVALID_OBJECT rather than JVMTI_ERROR_NULL_POINTER. And I note JVMTI_ERROR_NULL_POINTER is not even a listed error for StopThread! This part of the change seems unrelated to this issue. I was also surprised with the JVMTI_ERROR_NULL_POINTER and JVMTI_ERROR_INVALID_OBJECT error codes. The JVM TI spec automatic generation adds these two error codes for a jobject parameter. Also, they both are from the Universal Errors section: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/docs/specs/jvmti.html#universal-error You can find a link to this section at the start of the Error section: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/docs/specs/jvmti.html#StopThread My understanding (not sure, it is right) is that NULL has to be reported with JVMTI_ERROR_NULL_POINTER and a bad jobject (for instance, a WeakReference with a GC-ed target) has to be reported with JVMTI_ERROR_INVALID_OBJECT. At least, I was not able to construct a test case to get this error code returned. So, I'm puzzled with this. I'll try to find some examples with JVMTI_ERROR_NULL_POINTER errors. Found the explanation. The JDI file: src/jdk.jdi/share/classes/com/sun/tools/jdi/JDWPException.java has a fragment that translate the INVALID_OBJECT error to the ObjectCollectedException: RuntimeException toJDIException() { switch (errorCode) { case JDWP.Error.INVALID_OBJECT: return new ObjectCollectedException(); So, the INVALID_OBJECT is for a jobject handle that is referencing a collected object. It means that previous implementation incorrectly returned JVMTI_ERROR_NULL_POINTER error code. I should create and delete local or global ref to construct a test case for this. Interesting th
Re: RFR(XS): 8245126 Kitchensink fails with: assert(!method->is_old()) failed: Should not be installing old methods
Hi Dean, Thank you a lot for the review! I hope, Christian will have a chance to look at it. Thanks, Serguei On 6/3/20 14:56, Dean Long wrote: Hi Serguei, I like the latest changes so that JVMCI matches C2. Please get another review because this is not a trivial change. dl On 6/3/20 10:06 AM, serguei.spit...@oracle.com wrote: Hi Dean, The updated webrev is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.3/ Probably, the JVMCI part can be simplified. Only the compile_state line has to be moved up: +JVMCICompileState compile_state(task); // Skip redefined methods -if (target_handle->is_old()) { +if (compile_state.target_method_is_old()) { failure_reason = "redefined method"; retry_message = "not retryable"; compilable = ciEnv::MethodCompilable_never; } else { - JVMCICompileState compile_state(task); Fixes in the jvmciEnv.?pp are not really needed Please, let me know what do you think. This version does not fail at all (in 300 runs for both C2 and JVMCI). It seems, other two issues disappeared as well: This was seen with the C2: https://bugs.openjdk.java.net/browse/JDK-8245128 This was seen with the JVMCI: https://bugs.openjdk.java.net/browse/JDK-8245446 Thanks, Serguei On 6/1/20 23:40, serguei.spit...@oracle.com wrote: Hi Dean, Thank you for the reply. The problem is I do not fully understand your suggestion, especially the part about caching the method,is_old() value in the cache_jvmti_state(). This is a preliminary webrev where I tried to implement your suggestion: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.2/ This variant is failing in half of test runs for both C1/C2 and JVMCI. I think, the root cause is a safepoint in a ThreadInVMfromNative desctructor. Here: 232 void ciEnv::cache_jvmti_state() { 233 VM_ENTRY_MARK; Then we check for the target_method_is_old() value which is not up-to-date any more. I feel, it was correct and more simple before introducing this approach. Probably, I'm missing something here. I also have a question about the update fragment: 1696 { 1697 // Must switch to native to allocate ci_env 1698 ThreadToNativeFromVM ttn(thread); 1699 ciEnv ci_env((CompileTask*)NULL); 1700 1701 // Switch back to VM state to do compiler initialization 1702 ThreadInVMfromNative tv(thread); 1703 ResetNoHandleMark rnhm; 1704 1705 // Perform per-thread and global initializations 1706 comp->initialize(); 1707 } Can we remove the ciEnv object initialization above with the state transitions? Or it has some side effects? Please, let me know what you think. Thanks, Serguei On 6/1/20 15:10, Dean Long wrote: On 5/31/20 11:16 PM, serguei.spit...@oracle.com wrote: Hi Dean, To check the is_old as you suggest the target method has to be passed to the cache_jvmti_state() as argument. Is it what you are suggesting? I believe you can use use _task->method()->is_old(), as the ciEnv already has the task. Just want to make sure I understand you correctly. The cache_jvmti_state() and cache_dtrace_flags() are called in the CompileBroker::init_compiler_runtime() for a ciEnv with the NULL CompileTask which looks unnecessary (or I don't understand it): bool CompileBroker::init_compiler_runtime() { CompilerThread* thread = CompilerThread::current(); . . . ciEnv ci_env((CompileTask*)NULL); // Cache Jvmti state ci_env.cache_jvmti_state(); // Cache DTrace flags ci_env.cache_dtrace_flags();
Re: RFR(S) : 8246494 : introduce vm.flagless at-requires property
Hi Igor, On 4/06/2020 7:30 am, Igor Ignatyev wrote: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 70 lines changed: 66 ins; 0 del; 4 mod Hi all, could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? the idea behind this patch is to have a way to clearly mark tests which ignore flags, so a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; b) they can be easily excluded from runs w/ flags. So all such tests should be using driver mode, and further the VMs they then exec don't use any of the APIs that include the jtreg test arguments. Okay this seems reasonable in what it does. Thanks, David @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags [1] https://bugs.openjdk.java.net/browse/JDK-8151707 [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 [3] https://bugs.openjdk.java.net/browse/JDK-8246387 Thanks, -- Igor
Re: RFR: 8131745: java/lang/management/ThreadMXBean/AllThreadIds.java still fails intermittently
Hi Daniil, couple notes: 198 waitForThreads(numNewThreads, numTerminatedThreads); You don't actually need any wait here. Test cases wait until all threads are in desired state (checkAllThreadsAlive uses startupCheck, checkDaemonThreadsDead and checkAllThreadsDead use join()) 205 private static void checkLiveThreads(int numNewThreads, 206 int numTerminatedThreads) { 207 int diff = numNewThreads - numTerminatedThreads; 208 long threadCount = mbean.getThreadCount(); 209 long expectedThreadCount = prevLiveTestThreadCount + diff; 210 if (threadCount < expectedThreadCount) { if some internal thread terminates, we'll get failure here --alex On 06/02/2020 21:00, Daniil Titov wrote: Hi Alex, Serguei, and Martin, Thank you for your comments. Please review a new version of the fix that addresses them, specifically: 1) Replaces a double loop in checkAllThreadsAlive() with a code that uses collections and containsAll() method. 2) Restores the checks for other ThreadMXBean methods (getThreadCount(), getTotalStartedThreadCount(), getPeakThreadCount()) but with more relaxed conditions. 3) Relaxes the check inside checkThreadIds() method [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.03/ [2] https://bugs.openjdk.java.net/browse/JDK-8131745 Thank you, Daniil On 6/1/20, 5:06 PM, "Alex Menkov" wrote: Hi Daniil, 1. before the fix checkLiveThreads() tested ThreadMXBean.getThreadCount(), but now as far as I see it tests Thread.getAllStackTraces(); 2. 237 private static void checkThreadIds() throws InterruptedException { 238 long[] list = mbean.getAllThreadIds(); 239 240 waitTillEquals( 241 list.length, 242 ()->(long)mbean.getThreadCount(), 243 "Array length returned by " + 244 "getAllThreadIds() = %1$d not matched count = ${provided}", 245 ()->list.length 246 ); 247 } I suppose purpose of waitTillEquals() is to handle creation/termination of VM internal threads. But if some internal thread terminates after mbean.getAllThreadIds() and before 1st mbean.getThreadCount() call and then VM does not need to restart it, waitTillEquals will wait forever. --alex On 05/29/2020 16:28, Daniil Titov wrote: > Hi Alex and Serguei, > > Please review a new version of the change [1] that makes sure that the test counts > only the threads it creates and ignores Internal threads VM might create or destroy. > > Testing: Running this test in Mach5 with Graal on several hundred times , > tier1-tier3 tests are in progress. > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.02/ > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > Thank you, > Daniil > > On 5/22/20, 10:26 AM, "Alex Menkov" wrote: > > Hi Daniil, > > I'm not sure all this retry logic is a good way. > As mentioned in jira the most important part of the testing is ensuring > that you find all the created threads when they are alive, and you don't > find them when they are dead. The actual thread count checking is not > that important. > I agree with this and I'd just simplify the test by removing checks for > thread count. VM may create and destroy internal threads when it needs it. > > --alex > > On 05/18/2020 10:31, Daniil Titov wrote: > > Please review the change [1] that fixes an intermittent failure of the test. > > > > This test creates and destroys a given number of daemon/user threads and validates the count of those started/stopped threads against values returned from ThreadMXBean thread counts. The problem here is that if some internal threads is started ( e.g. " HotSpotGraalManagement Bean Registration"), or destroyed (e.g. "JVMCI CompilerThread ") the test hangs waiting for expected number of live threads. > > > > The fix limits the time the test is waiting for desired number of live threads and in case if this limit is exceeded the test repeats itself. > > > > Testing. Test with Graal on and Mach5 tier1-tier7 test passed. > > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.01 > > [2] https://bugs.openjdk.java.net/browse/JDK-8131745 > > > > Thank you, > > Daniil > > > > > > > >
Re: RFR(XS): 8245126 Kitchensink fails with: assert(!method->is_old()) failed: Should not be installing old methods
Hi Serguei, I like the latest changes so that JVMCI matches C2. Please get another review because this is not a trivial change. dl On 6/3/20 10:06 AM, serguei.spit...@oracle.com wrote: Hi Dean, The updated webrev is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.3/ Probably, the JVMCI part can be simplified. Only the compile_state line has to be moved up: + JVMCICompileState compile_state(task); // Skip redefined methods - if (target_handle->is_old()) { + if (compile_state.target_method_is_old()) { failure_reason = "redefined method"; retry_message = "not retryable"; compilable = ciEnv::MethodCompilable_never; } else { - JVMCICompileState compile_state(task); Fixes in the jvmciEnv.?pp are not really needed Please, let me know what do you think. This version does not fail at all (in 300 runs for both C2 and JVMCI). It seems, other two issues disappeared as well: This was seen with the C2: https://bugs.openjdk.java.net/browse/JDK-8245128 This was seen with the JVMCI: https://bugs.openjdk.java.net/browse/JDK-8245446 Thanks, Serguei On 6/1/20 23:40, serguei.spit...@oracle.com wrote: Hi Dean, Thank you for the reply. The problem is I do not fully understand your suggestion, especially the part about caching the method,is_old() value in the cache_jvmti_state(). This is a preliminary webrev where I tried to implement your suggestion: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.2/ This variant is failing in half of test runs for both C1/C2 and JVMCI. I think, the root cause is a safepoint in a ThreadInVMfromNative desctructor. Here: 232 void ciEnv::cache_jvmti_state() { 233 VM_ENTRY_MARK; Then we check for the target_method_is_old() value which is not up-to-date any more. I feel, it was correct and more simple before introducing this approach. Probably, I'm missing something here. I also have a question about the update fragment: 1696 { 1697 // Must switch to native to allocate ci_env 1698 ThreadToNativeFromVM ttn(thread); 1699 ciEnv ci_env((CompileTask*)NULL); 1700 1701 // Switch back to VM state to do compiler initialization 1702 ThreadInVMfromNative tv(thread); 1703 ResetNoHandleMark rnhm; 1704 1705 // Perform per-thread and global initializations 1706 comp->initialize(); 1707 } Can we remove the ciEnv object initialization above with the state transitions? Or it has some side effects? Please, let me know what you think. Thanks, Serguei On 6/1/20 15:10, Dean Long wrote: On 5/31/20 11:16 PM, serguei.spit...@oracle.com wrote: Hi Dean, To check the is_old as you suggest the target method has to be passed to the cache_jvmti_state() as argument. Is it what you are suggesting? I believe you can use use _task->method()->is_old(), as the ciEnv already has the task. Just want to make sure I understand you correctly. The cache_jvmti_state() and cache_dtrace_flags() are called in the CompileBroker::init_compiler_runtime() for a ciEnv with the NULL CompileTask which looks unnecessary (or I don't understand it): bool CompileBroker::init_compiler_runtime() { CompilerThread* thread = CompilerThread::current(); . . . ciEnv ci_env((CompileTask*)NULL); // Cache Jvmti state ci_env.cache_jvmti_state(); // Cache DTrace flags ci_env.cache_dtrace_flags(); These calls look unnecessary to me, as the ci_env will cache these again before compiling a method. I suggest removing these calls. We should make sure the cache fields are initialized to sane values in the ciEnv ctor. The JVMCI has a separate implementation for ciEnv which is jvmciEnv and its own set of cache_jvmti_state() and jvmti_state_changed() functions. Both are not called in the JVMCI case. So, these checks look as broken in JVMCI now. JVMCI is in better shape, because it doesn't transition out of _thread_in_vm state, but yes it needs similar changes. Not sure, I have enough compiler knowledge to fix this at this stage of release. Would it better to file a separate hotspot/compiler RFE targeted to 16? It can be assigned to me if it helps. This is a P3 so I believe we have time to fix it for 15. Please go ahead and let's see if we can get it in. I can help with the JVMCI changes if they are not straightforward. dl Thanks, Serguei On 5/28/20 10:54, Dean Long wrote: Sure, you could just have cache_jvmti_state() return a boolean to bail out immediately for is_old. dl On 5/28/20 7:23 AM, serguei.spit...@oracle.com wrote: Hi Dean, Thank you for looking at this! Okay. Let me check what cab be done in this direction. There is no point to cache is_old. The compilation has to bail out if it is discovered to be true. Thanks, Serguei On 5/28/20 00:59, Dean Long wrote: This seems OK as long as the memory barriers in the thread state transitions prevent the C++ compiler from doing something like reading is_old before reading redefinition_count. I would feel bett
RFR(S) : 8246494 : introduce vm.flagless at-requires property
http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 > 70 lines changed: 66 ins; 0 del; 4 mod Hi all, could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? the idea behind this patch is to have a way to clearly mark tests which ignore flags, so a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; b) they can be easily excluded from runs w/ flags. @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags [1] https://bugs.openjdk.java.net/browse/JDK-8151707 [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 [3] https://bugs.openjdk.java.net/browse/JDK-8246387 Thanks, -- Igor
Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently
Hi Chris, > Do you think 60 seconds is a bit long? Isn't the expectation that the join > should happen almost immediately or not at all? In case if an exception is thrown in the try block after the thread is started and before it is moved in the terminated state the join never happens at all. And in other cases the join should happen immediately. I agree that 60 seconds look as a bit long but I just wanted to minimize the odds that that we miss the log and the root cause if the issue is reproduced on some slow machine with such VM options as, say, -Xcomp. >I don't think you need a separate bug, but you should document in the bug what >currently can and can't be reproduce and what is being fixed. I will update the bug with this information. Best regards, Daniil From: Chris Plummer Date: Wednesday, June 3, 2020 at 12:46 PM To: Daniil Titov , David Holmes , serviceability-dev Subject: Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently Hi Daniil, Ok, I misread getLog() and see that is now always returns the log. Do you think 60 seconds is a bit long? Isn't the expectation that the join should happen almost immediately or not at all? I don't think you need a separate bug, but you should document in the bug what currently can and can't be reproduce and what is being fixed. thanks, Chris On 6/3/20 12:08 PM, Daniil Titov wrote: Hi Chris, I was not able to reproduce the original issue anymore in Mach5. However, the test itself has a potential for a deadlock (that was also reported) and in the proposed change we fix it. The log still should be printed and the expectation is that we will be able to see the underlying problem in it if it ever reproduced. I could create a separate bug ( not sure if the subtask is a good fit here since the change fixes some problem in the test ) and close the current one as not reproducible if you think it is a better approach. Regarding Thread.suspend() and Thread.resume() methods the test also checks the thread state after these methods are invoked and since these deprecated methods are still in API I don’t think we should exclude them from being tested. Best regards, Daniil From: Chris Plummer Date: Wednesday, June 3, 2020 at 11:40 AM To: David Holmes , Daniil Titov , serviceability-dev Subject: Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently On 6/1/20 12:10 AM, David Holmes wrote: Hi Daniil, On 30/05/2020 10:07 am, Daniil Titov wrote: Please review a change [1] that fixes an intermittent test timeout. The main logic of the test has this basic structure: try { // lots of thread state manipulation of target } finally { thread.getLog(); } and as David noticed in his comment ( the last comment in [2] ) if an exception occurs anywhere in the try block we can hang waiting for the join() in getLog() because we haven't executed the logic that tells the thread to terminate. So the fix puts a timeout on the join() which means the test will no longer timeout but it will still fail when whatever was leading to the timeout now happens. So as a diagnostic fix this seems fine. Hopefully the logger will show what we need to see and determine the real underlying problem. If this change is really just diagnostic in nature, then it should be a subtask. However, it seems to me it will actually hide the failure. The test won't get a timeout and won't print the log. Am I missing something? Also, after reading through the bug comments it looks like the getLog()/join() timeout issue is different from the main issue that caused the CR to be filed in the first place. Comments regarding the initial problem are: "According to the stack trace the test seems to hang on trying to load the 'java.lang.Math' class concurrently. " "Need to see some native stacks to understand why the classloading thread is not proceeding even though RUNNABLE." "I should have looked at the test first - it uses Thread.suspend and Thread.resume and so is inherently deadlock prone." Does this issue no longer exist, or have we decided that since the test is expected to be deadlock prone to just ignore it. thanks, Chris Thanks, David - Testing: Running a modified test that explicitly throws a runtime exception inside the try block shows the fix solves the problem. Mach5 tier1-tier3 tests passed. Mach5 tier4-tier5 tests are in progress. [1] http://cr.openjdk.java.net/~dtitov/8081652/webrev.01/ [2] https://bugs.openjdk.java.net/browse/JDK-8081652 Thank you, Daniil
PING: Re: RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE method(s) found
PING: One more review for this fix is needed. Thanks, Serguei On 6/3/20 09:52, serguei.spit...@oracle.com wrote: Thank you a lot for review, Coleen! Serguei On 6/3/20 08:50, coleen.phillim...@oracle.com wrote: Hi Serguei, This change looks great. Thank you for fixing this! Coleen On 5/28/20 7:16 PM, serguei.spit...@oracle.com wrote: Hi Coleen, The updated webrev version is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.3/ It has your suggestions addressed: - remove log_is_enabled conditions - move ResourceMark's out of loops Thanks, Serguei On 5/28/20 14:44, serguei.spit...@oracle.com wrote: Hi Coleen, Thank you a lot for reviewing this! On 5/28/20 12:48, coleen.phillim...@oracle.com wrote: Hi Serguei, Sorry for the delay reviewing this again. On 5/18/20 3:30 AM, serguei.spit...@oracle.com wrote: Hi Coleen and potential reviewers, Now, the webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/ has a complete fix for all three failure modes related to the guarantee about OLD and OBSOLETE methods. The root cause are the following optimizations: 1) Optimization based on the flag ik->is_being_redefined(): The problem is that the cpcache method entries of such classes are not being adjusted. It is explained below in the initial RFR summary. The fix is to get rid of this optimization. This seems like a good thing to do even though (actually especially because) I can't re-imagine the logic that went into this optimization. Probably, I've not explained it well enough. The logic was that the class marked as is_being_redefined was considered as being redefined in the current redefinition operation. For classes redefined in current redefinition the cpcache is empty, so there is nothing to adjust. The problem is that classes can be marked as is_being_redefined by doit_prologue of one of the following redefinition operations. In such a case, the VM_RedefineClasses::CheckClass::do_klass fails with this guarantee. It is because the VM_RedefineClasses::CheckClass::do_klass does not have this optimization and does not skip such classes as the VM_RedefineClasses::AdjustAndCleanMetadata::do_class. Without this catch this issue could have unknown consequences in the future execution far away from the root cause. 2) Optimization for array classes based on the flag _has_redefined_Object. The problem is that the vtable method entries are not adjusted for array classes. The array classes have to be adjusted even if the java.lang.Object was redefined by one of previous VM_RedefineClasses operation, not only if it was redefined in the current VM_RedefineClasses operation. The fix is is follow this requirement. This I can't understand. The redefinitions are serialized in safepoints, so why would you need to replace vtable entries for arrays if java.lang.Object isn't redefined in this safepoint? The VM_RedefineClasses::CheckClass::do_klass fails with the same guarantee because of this. It never fails this way with this optimization relaxed. I've already broke my head trying to understand it. It can be because of another bug we don't know yet. 3) Optimization based on the flag _has_null_class_loader which assumes that the Hotspot
Re: RFR(XS): 8234882: JVM TI StopThread should only allow ThreadDeath
Hi David, The JetBrains confirmed: Ability to select the exception is a valuable feature they provide. Throwing only ThreadDeath is almost useless. So, should I close this and related JDI/JDWP enhancements as WNF? Thanks, Serguei On 6/1/20 08:30, serguei.spit...@oracle.com wrote: Hi David, I'll check with JetBrains on this. Thank you to Dan and you for raising this concern. The JetBrains use case you posted in the CSR looks like valid and useful. Thanks, Serguei On 6/1/20 00:46, David Holmes wrote: Hi Serguei, Sorry, I think we have to re-think this change. As Dan flags in the CSR request debuggers directly expose this API as part of the debugger interface, so any change here will directly impact those tools. At a minimum I think we would need to consult with the tool developers about the impact of making this change, as well as whether it makes any practical difference in the sense that there may be other (less convenient but still available) mechanisms to achieve the same goal in a debugger or agent. David On 31/05/2020 5:50 pm, serguei.spit...@oracle.com wrote: Hi David, Also jumping to end. On 5/30/20 06:50, David Holmes wrote: Hi Serguei, Jumping to the end for now ... On 30/05/2020 5:50 am, serguei.spit...@oracle.com wrote: Hi David and reviewers, The updated webrev version is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.2/src/ This update adds testing that StopThread can return JVMTI_ERROR_INVALID_OBJECT error code. The JVM TI StopThread spec is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.2/docs/specs/jvmti.html#StopThread There is a couple of comments below. On 5/29/20 06:18, David Holmes wrote: On 29/05/2020 6:24 pm, serguei.spit...@oracle.com wrote: On 5/29/20 00:56, serguei.spit...@oracle.com wrote: On 5/29/20 00:42, serguei.spit...@oracle.com wrote: Hi David, Thank you for reviewing this! On 5/28/20 23:57, David Holmes wrote: Hi Serguei, On 28/05/2020 3:12 pm, serguei.spit...@oracle.com wrote: Hi David, I've updated the CSR and webrev in place. The changes are: - addressed David's suggestion to rephrase StopThread description change - replaced JVMTI_ERROR_INVALID_OBJECT with JVMTI_ERROR_ILLEGAL_ARGUMENT - updated the implementation in jvmtiEnv.cpp to return JVMTI_ERROR_ILLEGAL_ARGUMENT - updated one of the nsk.jvmti StopThread tests to check error case with the JVMTI_ERROR_ILLEGAL_ARGUMENT I'm reposting the links for convenience. Enhancement: https://bugs.openjdk.java.net/browse/JDK-8234882 CSR draft: https://bugs.openjdk.java.net/browse/JDK-8245853 Spec updates are good - thanks. Thank you for the CSR review. Webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/src/ src/hotspot/share/prims/jvmtiEnv.cpp The ThreadDeath check is fine but I'm a bit confused about the additional null check that leads to JVMTI_ERROR_INVALID_OBJECT. I can't see how resolve_external_guard can return NULL when not passed in NULL. Nor why that would result in JVMTI_ERROR_INVALID_OBJECT rather than JVMTI_ERROR_NULL_POINTER. And I note JVMTI_ERROR_NULL_POINTER is not even a listed error for StopThread! This part of the change seems unrelated to this issue. I was also surprised with the JVMTI_ERROR_NULL_POINTER and JVMTI_ERROR_INVALID_OBJECT error codes. The JVM TI spec automatic generation adds these two error codes for a jobject parameter. Also, they both are from the Universal Errors section: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/docs/specs/jvmti.html#universal-error You can find a link to this section at the start of the Error section: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-stop-thread.1/docs/specs/jvmti.html#StopThread My understanding (not sure, it is right) is that NULL has to be reported with JVMTI_ERROR_NULL_POINTER and a bad jobject (for instance, a WeakReference with a GC-ed target) has to be reported with JVMTI_ERROR_INVALID_OBJECT. At least, I was not able to construct a test case to get this error code returned. So, I'm puzzled with this. I'll try to find some examples with JVMTI_ERROR_NULL_POINTER errors. Found the explanation. The JDI file: src/jdk.jdi/share/classes/com/sun/tools/jdi/JDWPException.java has a fragment that translate the INVALID_OBJECT error to the ObjectCollectedException: RuntimeException toJDIException() { switch (errorCode) { case JDWP.Error.INVALID_OBJECT: return new ObjectCollectedException(); So, the INVALID_OBJECT is for a jobject handle that is referencing a collected object. It means that previous implementation incorrectly returned JVMTI_ERROR_NULL_POINTER error code. I should create and delete local or global ref to construct a test case for this. Interesting that the JDWPException::toJDIException() does not convert the ILLEGAL_ARGUMENT error code to an IllegalArgumentException. I've just ad
Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently
Hi Daniil, Ok, I misread getLog() and see that is now always returns the log. Do you think 60 seconds is a bit long? Isn't the expectation that the join should happen almost immediately or not at all? I don't think you need a separate bug, but you should document in the bug what currently can and can't be reproduce and what is being fixed. thanks, Chris On 6/3/20 12:08 PM, Daniil Titov wrote: Hi Chris, I was not able to reproduce the original issue anymore in Mach5. However, the test itself has a potential for a deadlock (that was also reported) and in the proposed change we fix it. The log still should be printed and the expectation is that we will be able to see the underlying problem in it if it ever reproduced. I could create a separate bug ( not sure if the subtask is a good fit here since the change fixes some problem in the test ) and close the current one as not reproducible if you think it is a better approach. Regarding Thread.suspend() and Thread.resume() methods the test also checks the thread state after these methods are invoked and since these deprecated methods are still in API I don’t think we should exclude them from being tested. Best regards, Daniil From: Chris Plummer Date: Wednesday, June 3, 2020 at 11:40 AM To: David Holmes , Daniil Titov , serviceability-dev Subject: Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently On 6/1/20 12:10 AM, David Holmes wrote: Hi Daniil, On 30/05/2020 10:07 am, Daniil Titov wrote: Please review a change [1] that fixes an intermittent test timeout. The main logic of the test has this basic structure: try { // lots of thread state manipulation of target } finally { thread.getLog(); } and as David noticed in his comment ( the last comment in [2] ) if an exception occurs anywhere in the try block we can hang waiting for the join() in getLog() because we haven't executed the logic that tells the thread to terminate. So the fix puts a timeout on the join() which means the test will no longer timeout but it will still fail when whatever was leading to the timeout now happens. So as a diagnostic fix this seems fine. Hopefully the logger will show what we need to see and determine the real underlying problem. If this change is really just diagnostic in nature, then it should be a subtask. However, it seems to me it will actually hide the failure. The test won't get a timeout and won't print the log. Am I missing something? Also, after reading through the bug comments it looks like the getLog()/join() timeout issue is different from the main issue that caused the CR to be filed in the first place. Comments regarding the initial problem are: "According to the stack trace the test seems to hang on trying to load the 'java.lang.Math' class concurrently. " "Need to see some native stacks to understand why the classloading thread is not proceeding even though RUNNABLE." "I should have looked at the test first - it uses Thread.suspend and Thread.resume and so is inherently deadlock prone." Does this issue no longer exist, or have we decided that since the test is expected to be deadlock prone to just ignore it. thanks, Chris Thanks, David - Testing: Running a modified test that explicitly throws a runtime exception inside the try block shows the fix solves the problem. Mach5 tier1-tier3 tests passed. Mach5 tier4-tier5 tests are in progress. [1] http://cr.openjdk.java.net/~dtitov/80
Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently
Hi Chris, I was not able to reproduce the original issue anymore in Mach5. However, the test itself has a potential for a deadlock (that was also reported) and in the proposed change we fix it. The log still should be printed and the expectation is that we will be able to see the underlying problem in it if it ever reproduced. I could create a separate bug ( not sure if the subtask is a good fit here since the change fixes some problem in the test ) and close the current one as not reproducible if you think it is a better approach. Regarding Thread.suspend() and Thread.resume() methods the test also checks the thread state after these methods are invoked and since these deprecated methods are still in API I don’t think we should exclude them from being tested. Best regards, Daniil From: Chris Plummer Date: Wednesday, June 3, 2020 at 11:40 AM To: David Holmes , Daniil Titov , serviceability-dev Subject: Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently On 6/1/20 12:10 AM, David Holmes wrote: Hi Daniil, On 30/05/2020 10:07 am, Daniil Titov wrote: Please review a change [1] that fixes an intermittent test timeout. The main logic of the test has this basic structure: try { // lots of thread state manipulation of target } finally { thread.getLog(); } and as David noticed in his comment ( the last comment in [2] ) if an exception occurs anywhere in the try block we can hang waiting for the join() in getLog() because we haven't executed the logic that tells the thread to terminate. So the fix puts a timeout on the join() which means the test will no longer timeout but it will still fail when whatever was leading to the timeout now happens. So as a diagnostic fix this seems fine. Hopefully the logger will show what we need to see and determine the real underlying problem. If this change is really just diagnostic in nature, then it should be a subtask. However, it seems to me it will actually hide the failure. The test won't get a timeout and won't print the log. Am I missing something? Also, after reading through the bug comments it looks like the getLog()/join() timeout issue is different from the main issue that caused the CR to be filed in the first place. Comments regarding the initial problem are: "According to the stack trace the test seems to hang on trying to load the 'java.lang.Math' class concurrently. " "Need to see some native stacks to understand why the classloading thread is not proceeding even though RUNNABLE." "I should have looked at the test first - it uses Thread.suspend and Thread.resume and so is inherently deadlock prone." Does this issue no longer exist, or have we decided that since the test is expected to be deadlock prone to just ignore it. thanks, Chris Thanks, David - Testing: Running a modified test that explicitly throws a runtime exception inside the try block shows the fix solves the problem. Mach5 tier1-tier3 tests passed. Mach5 tier4-tier5 tests are in progress. [1] http://cr.openjdk.java.net/~dtitov/8081652/webrev.01/ [2] https://bugs.openjdk.java.net/browse/JDK-8081652 Thank you, Daniil
Re: RFR JDK-8232222: Set state to 'linked' when an archived class is restored at runtime
Hi Jiangli, My point is, if we are introducing a behavioral change in an optimization, it should be carefully reviewed. That's a rule we have followed all along. Can we first push your change without the JVMTI behavioral change, and discuss the JVMTI behavioral change separately? If I understand your proposal correctly, in this RFE, we will first set boot classes to "linked" during restore_unsharable_info(). Subsequently, we will do the same thing for other classes. And after that, we will set classes to 'initialized' during restore_unsharable_info(). If that's the proposal, then we will be introducing more and more behavioral changes that are not only observable by JVMTI but also by Java code. I think we should discuss all these together to see if this is indeed the direction we want to take. Project Leyden is the right forum. Thanks - Ioi On 6/3/20 9:34 AM, Jiangli Zhou wrote: Hi Ioi, Monitoring agents are alway enabled in cloud production environments. The costs for agents are constant and always exist. The main motivation for the CDS work during the last several years was for cloud environments. Could you please explain why you think CDS should not be used for startup saving with JVMTI agents in cloud? Or, this and related future optimizations should not be enabled in that case? Majority of the Java startup improvement since OpenJDK 9 was achieved by small incremental improvements. Each such change has been a small saving only. Some of them were small enough and only measurable by instruction counts. However they were all worth the work and have been submitted to OpenJDK. As a result, we are seeing a good total startup improvement today with CDS enabled. This change is no exception. Even the saving is small, but it still should be done. Although I don't have data with agent enabled, I have provided performance data for before and after the change since the very beginning. In addition, I have also explained a few times that this change enables future optimizations for more general class pre-initialization approach. This is an important step for future work. So doing it right is crucial. Regards, Jiangli On Tue, Jun 2, 2020 at 9:34 PM Ioi Lam wrote: Hi Jiangli, Before we spend time on the CSR review, do you have any data that shows the actual benefit of doing this? I am specifically asking about the benefit to JVMTI agents. As I mentioned before, there's an alternative, which is to not use the optimization when JVMTI is enabled. I don't think we should spend time worrying about the impact to JVMTI agents unless there's a compelling reasons to do so. Thanks - Ioi On 6/2/20 5:19 PM, Jiangli Zhou wrote: Here is the CSR: https://bugs.openjdk.java.net/browse/JDK-8246289. David, I described that the JVM spec allows for eager or lazy linking and agents shouldn't rely on the timing/ordering, as you suggested. Please review the CSR. It's been a while since I've worked on a CSR, could you please remind me if the CSR should be proposed before reviewing? I can revert it to draft state if draft is the correct state before reviewing. Thanks! Best regards, Jiangli
Re: RFR: 8081652: java/lang/management/ThreadMXBean/ThreadMXBeanStateTest.java timed out intermittently
On 6/1/20 12:10 AM, David Holmes wrote: Hi Daniil, On 30/05/2020 10:07 am, Daniil Titov wrote: Please review a change [1] that fixes an intermittent test timeout. The main logic of the test has this basic structure: try { // lots of thread state manipulation of target } finally { thread.getLog(); } and as David noticed in his comment ( the last comment in [2] ) if an exception occurs anywhere in the try block we can hang waiting for the join() in getLog() because we haven't executed the logic that tells the thread to terminate. So the fix puts a timeout on the join() which means the test will no longer timeout but it will still fail when whatever was leading to the timeout now happens. So as a diagnostic fix this seems fine. Hopefully the logger will show what we need to see and determine the real underlying problem. If this change is really just diagnostic in nature, then it should be a subtask. However, it seems to me it will actually hide the failure. The test won't get a timeout and won't print the log. Am I missing something? Also, after reading through the bug comments it looks like the getLog()/join() timeout issue is different from the main issue that caused the CR to be filed in the first place. Comments regarding the initial problem are: "According to the stack trace the test seems to hang on trying to load the 'java.lang.Math' class concurrently. " "Need to see some native stacks to understand why the classloading thread is not proceeding even though RUNNABLE." "I should have looked at the test first - it uses Thread.suspend and Thread.resume and so is inherently deadlock prone." Does this issue no longer exist, or have we decided that since the test is expected to be deadlock prone to just ignore it. thanks, Chris Thanks, David - Testing: Running a modified test that explicitly throws a runtime exception inside the try block shows the fix solves the problem. Mach5 tier1-tier3 tests passed. Mach5 tier4-tier5 tests are in progress. [1] http://cr.openjdk.java.net/~dtitov/8081652/webrev.01/ [2] https://bugs.openjdk.java.net/browse/JDK-8081652 Thank you, Daniil
Re: RFR(XS): 8245126 Kitchensink fails with: assert(!method->is_old()) failed: Should not be installing old methods
Hi Dean, The updated webrev is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.3/ Probably, the JVMCI part can be simplified. Only the compile_state line has to be moved up: +JVMCICompileState compile_state(task); // Skip redefined methods -if (target_handle->is_old()) { +if (compile_state.target_method_is_old()) { failure_reason = "redefined method"; retry_message = "not retryable"; compilable = ciEnv::MethodCompilable_never; } else { - JVMCICompileState compile_state(task); Fixes in the jvmciEnv.?pp are not really needed Please, let me know what do you think. This version does not fail at all (in 300 runs for both C2 and JVMCI). It seems, other two issues disappeared as well: This was seen with the C2: https://bugs.openjdk.java.net/browse/JDK-8245128 This was seen with the JVMCI: https://bugs.openjdk.java.net/browse/JDK-8245446 Thanks, Serguei On 6/1/20 23:40, serguei.spit...@oracle.com wrote: Hi Dean, Thank you for the reply. The problem is I do not fully understand your suggestion, especially the part about caching the method,is_old() value in the cache_jvmti_state(). This is a preliminary webrev where I tried to implement your suggestion: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/kitchensink-comp.2/ This variant is failing in half of test runs for both C1/C2 and JVMCI. I think, the root cause is a safepoint in a ThreadInVMfromNative desctructor. Here: 232 void ciEnv::cache_jvmti_state() { 233 VM_ENTRY_MARK; Then we check for the target_method_is_old() value which is not up-to-date any more. I feel, it was correct and more simple before introducing this approach. Probably, I'm missing something here. I also have a question about the update fragment: 1696 { 1697 // Must switch to native to allocate ci_env 1698 ThreadToNativeFromVM ttn(thread); 1699 ciEnv ci_env((CompileTask*)NULL); 1700 1701 // Switch back to VM state to do compiler initialization 1702 ThreadInVMfromNative tv(thread); 1703 ResetNoHandleMark rnhm; 1704 1705 // Perform per-thread and global initializations 1706 comp->initialize(); 1707 } Can we remove the ciEnv object initialization above with the state transitions? Or it has some side effects? Please, let me know what you think. Thanks, Serguei On 6/1/20 15:10, Dean Long wrote: On 5/31/20 11:16 PM, serguei.spit...@oracle.com wrote: Hi Dean, To check the is_old as you suggest the target method has to be passed to the cache_jvmti_state() as argument. Is it what you are suggesting? I believe you can use use _task->method()->is_old(), as the ciEnv already has the task. Just want to make sure I understand you correctly. The cache_jvmti_state() and cache_dtrace_flags() are called in the CompileBroker::init_compiler_runtime() for a ciEnv with the NULL CompileTask which looks unnecessary (or I don't understand it): bool CompileBroker::init_compiler_runtime() { CompilerThread* thread = CompilerThread::current(); . . . ciEnv ci_env((CompileTask*)NULL); // Cache Jvmti state ci_env.cache_jvmti_state(); // Cache DTrace flags ci_env.cache_dtrace_flags(); These calls look unnecessary to me, as the ci_env will cache these again before compiling a method. I suggest removing these calls. We should make sure the cache fields are initialized to sane values in the ciEnv ctor. The JVMCI has a separate implementation for ciEnv which is jvmciEnv and its own set of cache_jvmti_state() and jvmti_state_changed() functions. Both are not called in the JVMCI case. So, these checks look as broken in JVMCI now. JVMCI is in better shape, because it doesn't transition out of _thread_in_vm state, but yes it needs similar changes. Not sure, I have enough compiler knowledge to fix this at this stage of release. Wo
Re: RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE method(s) found
Thank you a lot for review, Coleen! Serguei On 6/3/20 08:50, coleen.phillim...@oracle.com wrote: Hi Serguei, This change looks great. Thank you for fixing this! Coleen On 5/28/20 7:16 PM, serguei.spit...@oracle.com wrote: Hi Coleen, The updated webrev version is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.3/ It has your suggestions addressed: - remove log_is_enabled conditions - move ResourceMark's out of loops Thanks, Serguei On 5/28/20 14:44, serguei.spit...@oracle.com wrote: Hi Coleen, Thank you a lot for reviewing this! On 5/28/20 12:48, coleen.phillim...@oracle.com wrote: Hi Serguei, Sorry for the delay reviewing this again. On 5/18/20 3:30 AM, serguei.spit...@oracle.com wrote: Hi Coleen and potential reviewers, Now, the webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/ has a complete fix for all three failure modes related to the guarantee about OLD and OBSOLETE methods. The root cause are the following optimizations: 1) Optimization based on the flag ik->is_being_redefined(): The problem is that the cpcache method entries of such classes are not being adjusted. It is explained below in the initial RFR summary. The fix is to get rid of this optimization. This seems like a good thing to do even though (actually especially because) I can't re-imagine the logic that went into this optimization. Probably, I've not explained it well enough. The logic was that the class marked as is_being_redefined was considered as being redefined in the current redefinition operation. For classes redefined in current redefinition the cpcache is empty, so there is nothing to adjust. The problem is that classes can be marked as is_being_redefined by doit_prologue of one of the following redefinition operations. In such a case, the VM_RedefineClasses::CheckClass::do_klass fails with this guarantee. It is because the VM_RedefineClasses::CheckClass::do_klass does not have this optimization and does not skip such classes as the VM_RedefineClasses::AdjustAndCleanMetadata::do_class. Without this catch this issue could have unknown consequences in the future execution far away from the root cause. 2) Optimization for array classes based on the flag _has_redefined_Object. The problem is that the vtable method entries are not adjusted for array classes. The array classes have to be adjusted even if the java.lang.Object was redefined by one of previous VM_RedefineClasses operation, not only if it was redefined in the current VM_RedefineClasses operation. The fix is is follow this requirement. This I can't understand. The redefinitions are serialized in safepoints, so why would you need to replace vtable entries for arrays if java.lang.Object isn't redefined in this safepoint? The VM_RedefineClasses::CheckClass::do_klass fails with the same guarantee because of this. It never fails this way with this optimization relaxed. I've already broke my head trying to understand it. It can be because of another bug we don't know yet. 3) Optimization based on the flag _has_null_class_loader which assumes that the Hotspot does not support delegation from the bootstrap class loader to a user-defined class loader. The assumption is that if the current class being redefined has a user-defined class loader as its defining class loader, then all classes loaded by the bootstrap class loader can be skipped for vtable/it
Re: RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE method(s) found
Hi Serguei, This change looks great. Thank you for fixing this! Coleen On 5/28/20 7:16 PM, serguei.spit...@oracle.com wrote: Hi Coleen, The updated webrev version is: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.3/ It has your suggestions addressed: - remove log_is_enabled conditions - move ResourceMark's out of loops Thanks, Serguei On 5/28/20 14:44, serguei.spit...@oracle.com wrote: Hi Coleen, Thank you a lot for reviewing this! On 5/28/20 12:48, coleen.phillim...@oracle.com wrote: Hi Serguei, Sorry for the delay reviewing this again. On 5/18/20 3:30 AM, serguei.spit...@oracle.com wrote: Hi Coleen and potential reviewers, Now, the webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/ has a complete fix for all three failure modes related to the guarantee about OLD and OBSOLETE methods. The root cause are the following optimizations: 1) Optimization based on the flag ik->is_being_redefined(): The problem is that the cpcache method entries of such classes are not being adjusted. It is explained below in the initial RFR summary. The fix is to get rid of this optimization. This seems like a good thing to do even though (actually especially because) I can't re-imagine the logic that went into this optimization. Probably, I've not explained it well enough. The logic was that the class marked as is_being_redefined was considered as being redefined in the current redefinition operation. For classes redefined in current redefinition the cpcache is empty, so there is nothing to adjust. The problem is that classes can be marked as is_being_redefined by doit_prologue of one of the following redefinition operations. In such a case, the VM_RedefineClasses::CheckClass::do_klass fails with this guarantee. It is because the VM_RedefineClasses::CheckClass::do_klass does not have this optimization and does not skip such classes as the VM_RedefineClasses::AdjustAndCleanMetadata::do_class. Without this catch this issue could have unknown consequences in the future execution far away from the root cause. 2) Optimization for array classes based on the flag _has_redefined_Object. The problem is that the vtable method entries are not adjusted for array classes. The array classes have to be adjusted even if the java.lang.Object was redefined by one of previous VM_RedefineClasses operation, not only if it was redefined in the current VM_RedefineClasses operation. The fix is is follow this requirement. This I can't understand. The redefinitions are serialized in safepoints, so why would you need to replace vtable entries for arrays if java.lang.Object isn't redefined in this safepoint? The VM_RedefineClasses::CheckClass::do_klass fails with the same guarantee because of this. It never fails this way with this optimization relaxed. I've already broke my head trying to understand it. It can be because of another bug we don't know yet. 3) Optimization based on the flag _has_null_class_loader which assumes that the Hotspot does not support delegation from the bootstrap class loader to auser-defined class loader.The assumption is that if the current class being redefined has a user-defined classloader as its defining class loader, then allclasses loaded by the bootstrap class loader can be skipped for vtable/itable method entries adjustment. The problem is that this assumption is not really correct. There are classes that still need the adjustment. For instance, the class java.util.IdentityHashMap$KeyIterator loaded by the bootstrap class loader has the vtable/itable references to the method: java.util.Iterator.forEachRemaining(java.util.function.Consumer) The class java.util.Iterator is defined by a user-defined class loader. The fix is to get rid of this optimization. Also with this optimization, I'm not sure what the logic was that determined that this was safe, so it's best to remove it. Above makes sense. I don't know the full theory behind this optimization. We only have a comment. All three failure modes are observed with the -Xcomp flag. With all three fixes above in place, the Kitchensink does not fail with this guarantee anymore. http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/cpCache.cpp.udiff.html For logging, the log_trace function will also repeat the 'if' statement and not allocate the external_name() if logging isn't specified, so you don't need the 'if' statement above. + if (log_is_enabled(Trace, redefine, class, update)) { + log_trace(redefine, class, update, constantpool) + ("cpc %s entry update: %s", entry_type, new_method->external_name()); http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/klassVtable.cpp.udiff.html Same in two cases here, and you could move the ResourceMark outside the loop at the top. Good suggestions, taken. Thanks! Serguei
Re: RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE method(s) found
On 5/28/20 5:44 PM, serguei.spit...@oracle.com wrote: Hi Coleen, Thank you a lot for reviewing this! On 5/28/20 12:48, coleen.phillim...@oracle.com wrote: Hi Serguei, Sorry for the delay reviewing this again. On 5/18/20 3:30 AM, serguei.spit...@oracle.com wrote: Hi Coleen and potential reviewers, Now, the webrev: http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/ has a complete fix for all three failure modes related to the guarantee about OLD and OBSOLETE methods. The root cause are the following optimizations: 1) Optimization based on the flag ik->is_being_redefined(): The problem is that the cpcache method entries of such classes are not being adjusted. It is explained below in the initial RFR summary. The fix is to get rid of this optimization. This seems like a good thing to do even though (actually especially because) I can't re-imagine the logic that went into this optimization. Probably, I've not explained it well enough. The logic was that the class marked as is_being_redefined was considered as being redefined in the current redefinition operation. For classes redefined in current redefinition the cpcache is empty, so there is nothing to adjust. The problem is that classes can be marked as is_being_redefined by doit_prologue of one of the following redefinition operations. In such a case, the VM_RedefineClasses::CheckClass::do_klass fails with this guarantee. It is because the VM_RedefineClasses::CheckClass::do_klass does not have this optimization and does not skip such classes as the VM_RedefineClasses::AdjustAndCleanMetadata::do_class. Without this catch this issue could have unknown consequences in the future execution far away from the root cause. Yes this makes sense. Two threads are redefining a set of classes in parallel, not at a safepoint: t1: class A, B, C => marks them all as is_being_redefined t2: class D, E, F => marks these as is_being_redefined safepoint classes A, B, C are finishing redefinition in doit() so have their Methods replaced, and with is_being_redefine set for D, E, F the optimization was skipping replacing their Methods. One of these classes D could have had a B::foo() in the vtable or cpCache. crash in the check_classes! 2) Optimization for array classes based on the flag _has_redefined_Object. The problem is that the vtable method entries are not adjusted for array classes. The array classes have to be adjusted even if the java.lang.Object was redefined by one of previous VM_RedefineClasses operation, not only if it was redefined in the current VM_RedefineClasses operation. The fix is is follow this requirement. This I can't understand. The redefinitions are serialized in safepoints, so why would you need to replace vtable entries for arrays if java.lang.Object isn't redefined in this safepoint? The VM_RedefineClasses::CheckClass::do_klass fails with the same guarantee because of this. It never fails this way with this optimization relaxed. I've already broke my head trying to understand it. It can be because of another bug we don't know yet. Me neither but that's fine. Remove the optimization! Coleen 3) Optimization based on the flag _has_null_class_loader which assumes that the Hotspot does not support delegation from the bootstrap class loader to auser-defined class loader.The assumption is that if the current class being redefined has a user-defined classloader as its defining class loader, then allclasses loaded by the bootstrap class loader can be skipped for vtable/itable method entries adjustment. The problem is that this assumption is not really correct. There are classes that still need the adjustment. For instance, the class java.util.IdentityHashMap$KeyIterator loaded by the bootstrap class loader has the vtable/itable references to the method: java.util.Iterator.forEachRemaining(java.util.function.Consumer) The class java.util.Iterator is defined by a user-defined class loader. The fix is to get rid of this optimization. Also with this optimization, I'm not sure what the logic was that determined that this was safe, so it's best to remove it. Above makes sense. I don't know the full theory behind this optimization. We only have a comment. All three failure modes are observed with the -Xcomp flag. With all three fixes above in place, the Kitchensink does not fail with this guarantee anymore. http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/cpCache.cpp.udiff.html For logging, the log_trace function will also repeat the 'if' statement and not allocate the external_name() if logging isn't specified, so you don't need the 'if' statement above. + if (log_is_enabled(Trace, redefine, class, update)) { + log_trace(redefine, class, update, constantpool) + ("cpc %s entry update: %s", entry_type, new_method->external_name()); http://cr.openjdk.java.net
RE: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
Hi Ralf, I had a look at your change, webrev.3. Thanks for contributing this! Overall, a nicely engineered piece of work. Thus, my comments are mostly minor details: diagnosticCommand.hpp ok. diagnosticCommand.cpp: l.510 I would be a bit more precise in the comment: ..."9 the slowest level with the best compression." or maybe "strongest compression"? l. 528 I would appreciate if you fixed the existing comment wrt. to the language: // Request a full GC before dumping the heap if _all is false. // This helps reduce the amount of unreachable objects in the dump heapDumper.hpp ok. heapDumper.cpp Error messags is now recorded in _backend. ok. Not overwriting file is moved to FileWriter, ok. I like how you split the existing code with few changes to distribute the work to the thread gang, nice! l.1808 // Now we clear the global variables, so that a future dumper might run. Is "might" correct? Isn't is "can"? l.1819 // Write the file header - we always use 1.0. You lost the ".2" from 1.0.2. heapDumperCompression.hpp Usually, in the include guards, only '/' are replaced by '_'. l.31 Extra whitespace before "implementation". l.36 Initialized --> Initializes Return --> Returns it initialized --> initializes l.119 works --> WriteWorks ... I had to think about this a while to figure it's not a typo of 'work' but names WriteWork instances in short. But the term is used throughout the code, so maybe leave it as-is. l.163 Remove "to". l.165 returns the old --> commits the old ... or the like. l.210 type-o maxiumum heapDumperCompression.cpp It's a bit confusing that the static variable is called gzip_func (referring to a dedicated function), while there is a method load_gzip_func that loads any function from the gzip library. What about gzip_zip_func for the variable? l.113 What's the point of increasing needed_out_size after the call? You increment the pointer? l.125 add "of the": good choice of the buffer sizes CompressionBackend(): The check not to overwrite the inital, first error is in set_error(). ok. l.224 I think the comment should say "write the last remaining partially" l.400 I had one overall question, which I think is ansered here at least partially: As I understand, writing the dump now needs more buffer memory, as there are several WriteWorks held at the same time. Are they smaller than the buffer used before, so no additional memory is needed, or is there a fallback if only a few can be allocated? Is the fallback implemented here implicitly? Just because if there is no memory for more works, the algorithm uses the ones it could allocate, which might result in some idle threads as there are less works than threads? This makes it more flexible wrt. to available memory than the implementation before, right? l.441 indentation l.458 I can't understand why this variable is named "left". Is this past tense of to leave? Or do you mean the left, filled, side of the buffer? Another question. The basic dumping is done sequential, right? The comression is parallel. Is there a tradeoff in #of threads where the compression is faster than writing? zip_util.c Looks good. I appreciate the precise error message handling you are doing. Could you please add comments that these functions are used for heap dump compression? HprofReader.java ok. Reader.java Should you close in and in2 in case of error? GzipRandomAccess.java l.146 closes -> close l.158 "the the" This file nicely demonstrates how to read the zipped hprof. Maybe you can add a hint in the JBS issue to this file? HeapDumpCompressedTest.java ok. The other Tests: Please merge them all into HeapDumpCompressedTest by using repeated @test comments. You might not be aware this is supported by jtreg. See test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java for an example. It will run each @test block sperately and evaluate the @requires as expected. Best regards, Goetz. -Original Message- From: serviceability-dev On Behalf Of Schmelter, Ralf Sent: Montag, 18. Mai 2020 09:23 To: Langer, Christoph Cc: serviceability-dev@openjdk.java.net; hotspot-runtime-...@openjdk.java.net runtime Subject: [CAUTION] RE: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi Christoph, I've updated the webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.3/ The significant changes are moving most of the new compression code to its own file, changing to use a single option (see CSR) called -gz with a mandatory compression level and to load the zlib only once (analog to the new class loader code). Additionally I've removed some long lines. Best regards, Ralf -Original Message- From: Langer, Christoph Sent: Friday, 1 May 2020 18:46 To: Schmelter, Ralf Cc: serviceability-dev@openjdk.java.net; hotspot-runtime-...@openjdk.java.net runtime ; coleen.phillim...@oracle.com Subject: RE: RFR(L) 8237354: Add option to jcmd to