Re: RFR 8230303 : JDB hangs when running monitor command
Thank you Serguei for review! On 9/9/19 9:14 PM, serguei.spit...@oracle.com wrote: Hi Ivan, Thank you for filing the shadow bug and fixing this issue! I've targeted the bug to 14 (please, fix me if it is wrong). Yes, this is correct, thanks. The fix looks good to me. Do you have any plans to backport it to the earlier releases? Yes, I'm planning to request a backport to the JDK 13 shortly. With kind regards, Ivan
RFR 8230303 : JDB hangs when running monitor command
Hello! jdb utility has a command 'monitor ', which allows to execute the specified command every time the debuggee stops. If the modifies the list of installed monitors (the simplest example is 'monitor unmonitor 1'), then jdb hits a ConcurrentModificationException, and hangs the debuggee. While it is questionable, if modifying the monitor list has to be implemented in some specific way, it seems sensible to at least prevent a hard failure. The simplest fix appears to be to use CopyOnWriteArrayList, so that an immutable snapshot of the list will be traversed. Even if the list is modified, it wouldn't affect any iterator that might exist at that moment of time. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8230303 WEBREV: http://cr.openjdk.java.net/~igerasim/8230303/00/webrev/ Mach5 control build was successfully run with hs-tier7-rt, which includes the new test and other jdb related tests. Would you please help review? Thanks in advance! -- With kind regards, Ivan Gerasimov
Re: RFR 8160892: VM warning: WaitForMultipleObjects timed out
Hey! I've updated the webrev in place at http://cr.openjdk.java.net/~igerasim/8160892/01/webrev/ On 15.07.2016 0:31, Daniel D. Daugherty wrote: On 7/14/16 12:46 PM, Ivan Gerasimov wrote: Thank you David for looking into this! Here's the webrev updated in accordance with your and Daniel's suggestions: http://cr.openjdk.java.net/~igerasim/8160892/01/webrev/ src/os/windows/vm/os_windows.cpp L3900: bool registered = false; Please consider adding a comment above this variable: // We only attempt to register threads until a process exiting // thread manages to set the process_exiting flag. Any threads // that come through here after the process_exiting flag is set // are unregistered and will be caught in the SuspendThread() // infinite loop below. Sure. Added this comment. My request is different than David's. I prefer the comment up here by the 'registered' variable because that variable is key for getting into the SuspendThread() infinite loop. The comment also occurs before all the code paths that you have trace out with the different Ept types and flags states. I'm really hoping that this is the last tweak we need to make to clearly comment what we're doing to alleviate this OS race. The number of occurrences of the underlying bug keep going down with every iteration so we're getting close. Yes. This is my hope too. With kind regards, Ivan
Re: RFR 8160892: VM warning: WaitForMultipleObjects timed out
Thank you David for looking into this! Here's the webrev updated in accordance with your and Daniel's suggestions: http://cr.openjdk.java.net/~igerasim/8160892/01/webrev/ Please see my answers inline Nit: can we change 'registered_itself" to just "registered" please. Done. Can you explain under what conditions a thread will now reach the self-suspension code. Is that only if an error occurred such that we were unable to register our handle for the process-exiting thread to wait on? If so some commentary on that block seems appropriate - perhaps more appropriate there than back up at the place where it failed to get the handle (as Dan requested). There are three kinds of threads, which can be caught in that self-suspension loop: 1) All threads that want to end (by calling _endthreadex()) *after* some process-exiting thread raised the flag `process_exiting`. The rationale here is that we know that the whole process is going to be terminated quite soon, so we do not allow any thread to call _endthreadex(), which seems to have the concurrency bug. 2) Any thread that wants to end the whole process, after some other thread raised the flag `process_exiting`. If more than one threads want to end the process, we let to do it only the thread that could raise the flag `process_exiting`. All other such threads will have to suspend themselves. 3) (Unlikely to happen in practice) Any thread that wants to end by calling _endthreadex(), but which failed to register itself due to failure of DuplicateHandle(). Here we still have a race, which can result in a wrong exit code of the process. Given we keep missing conditions I'm only cautiously optimistic about this. And I'd like to understand how we still sometimes end up exiting with an "error code" that seems to be the value of an exception! :( The last time the sentinel exit code =20115 was reported almost a year ago. After that the fix for JDK-8145127 had gone in, and I didn't see any more reports about wrong exit codes since then. In particular, that fix worked around the situation when more than one threads concurrently call System.exit(), which might have caused a race. With kind regards, Ivan Thanks, David With kind regards, Ivan
Re: RFR 8160892: VM warning: WaitForMultipleObjects timed out
Thanks Daniel! I'll rearrange comments as you suggest. With kind regards, Ivan On 07.07.2016 0:21, Daniel D. Daugherty wrote: On 7/6/16 10:57 AM, Ivan Gerasimov wrote: Hello! When fixing JDK-8069048 a potential race was overlooked: 1) Thread A wants to call _endthreadex() and registers itself, 2) Thread B wants to call exit(), takes its turn and raises the flag `process_exiting`, 3) Thread A continues, and gets captured in the loop at 4020, in SuspendThread(GetCurrentThread()), 4) Now, the thread B will have to wait for all the registered threads, including the thread A, which will never wake up. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8160892 WEBREV: http://cr.openjdk.java.net/~igerasim/8160892/00/webrev/ src/os/windows/vm/os_windows.cpp Just to make sure the race is clear (in my mind at least): Thread A - executes if-block at L3913-3968; in particular it registers itself on L3957 - executes if-block at L4017-4027; in the old code, the thread would block for ever in L4022-4026; in the new code, the thread will only block if it did not register itself. Thread B: - executes if-block at L3906-3910 and succeeds in setting the process_exiting flag - executes if-block at L3969-4012; in particular Thread B calls WaitForMultipleObjects() on L3994; in the old code, WaitForMultipleObjects() times out because Thread A is blocked. L3963: registered_itself = true; L3964: } L3965: L3966: // The current exiting thread has stored its handle in the array, and now L3967: // should leave the critical section before calling _endthreadex(). The existing comment on L3966-3967 belongs after L3963 and before L3964, i.e., at the end of the else-block. If you agree with moving the comment on L3966-3967, then consider this next request: L3959: warning("DuplicateHandle failed (%u) in %s: %d\n", L3960: GetLastError(), __FILE__, __LINE__); Please consider adding the following comment after the warning: // We can't register this thread (no more handles) so this thread // may be racing with a thread that is calling exit(). If the thread // that is calling exit() has managed to set the process_exiting // flag, then this thread will be caught in the SuspendThread() // infinite loop below which closes that race. A small timing // window remains before the process_exiting flag is set, but it // is only exposed when we are out of handles. L4021: // don't let the current thread proceed to exit() or _endthreadex() Clarify comment: 'current thread' -> 'current unregistered thread' Very nice and very quick job in finding this race! Thumbs up on what you have since my comments are only related to moving/adding comments. Dan With kind regards, Ivan
RFR 8160892: VM warning: WaitForMultipleObjects timed out
Hello! When fixing JDK-8069048 a potential race was overlooked: 1) Thread A wants to call _endthreadex() and registers itself, 2) Thread B wants to call exit(), takes its turn and raises the flag `process_exiting`, 3) Thread A continues, and gets captured in the loop at 4020, in SuspendThread(GetCurrentThread()), 4) Now, the thread B will have to wait for all the registered threads, including the thread A, which will never wake up. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8160892 WEBREV: http://cr.openjdk.java.net/~igerasim/8160892/00/webrev/ With kind regards, Ivan
RFR 8160892: VM warning: WaitForMultipleObjects timed out
Hello! When fixing JDK-8069048 a potential race was overlooked: 1) Thread A wants to call _endthreadex() and registers itself, 2) Thread B wants to call exit(), takes its turn and raises the flag `process_exiting`, 3) Thread A continues, and gets captured in the loop at 4020, in SuspendThread(GetCurrentThread()), 4) Now, the thread B will have to wait for all the registered threads, including the thread A, which will never wake up. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8160892 WEBREV: http://cr.openjdk.java.net/~igerasim/8160892/00/webrev/ With kind regards, Ivan
Re: RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
Thanks David and Daniel! Yes, as David wrote, a thread-id is 32-bit on Windows. I'll do all the suggested changes and will push the fix later today. Sincerely yours, Ivan On 06.01.2016 7:25, David Holmes wrote: On 6/01/2016 3:33 AM, Daniel D. Daugherty wrote: Happy New Year! And to you :) > http://cr.openjdk.java.net/~igerasim/8145127/01/webrev/ src/os/windows/vm/os_windows.cpp L3923: Atomic::cmpxchg((jint)GetCurrentThreadId(), _exiting, 0); What's the return size of GetCurrentThreadId()? Are we It is a DWORD so 32-bit. Cheers, David - down casting a larger size into a 'jint'? If so, that might allow more than one thread into this code when we only want one... Also please consider adding a comment above this line. Perhaps something like: // Atomically set process_exiting before the critical section // to increase the visibility between racing threads. L3998: if (portion_count > MAXIMUM_WAIT_OBJECTS) { L3999: portion_count = MAXIMUM_WAIT_OBJECTS; Wrong indent; should be two spaces relative to L3998. L4013: portion_count = handle_count - i; Please consider adding a comment for this error case. Perhaps something like: // Reset portion_count so we close the remaining // handles due to this error. L4030: if (OrderAccess::load_acquire(_exiting) != 0 && L4031: process_exiting != (jint)GetCurrentThreadId()) { L4032: while (true) { L4033: // Some other thread is about to call exit(), so we L4034: // don't let the current thread proceed to exit() or _endthreadex() The comment on L4033-4 is slightly misplaced now. It should be between L4031 and L4032. Thumbs up modulo the GetCurrentThreadId() return size question above. If that size is OK and you choose to make the comment changes, I don't need to see another webrev. Dan On 12/23/15 5:20 AM, Ivan Gerasimov wrote: Thank you David for review! Sincerely yours, Ivan On 23.12.2015 7:41, David Holmes wrote: Looks okay. Second review needed though. Thanks, David On 19/12/2015 10:28 PM, Ivan Gerasimov wrote: We will suspend the current thread if two conditions are satisfied: process_exiting != 0 and process_exiting != current thread id. But, yes, pr_ex isn't really needed, as we can use process_exiting directly. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/8145127/01/webrev/ I still see pr_ex ?? Oops. Forgot to 'hg qrefresh' before generating the webrev. The webrev has just been updated in place. Sincerely yours, Ivan
Re: RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
Thank you David for review! Sincerely yours, Ivan On 23.12.2015 7:41, David Holmes wrote: Looks okay. Second review needed though. Thanks, David On 19/12/2015 10:28 PM, Ivan Gerasimov wrote: We will suspend the current thread if two conditions are satisfied: process_exiting != 0 and process_exiting != current thread id. But, yes, pr_ex isn't really needed, as we can use process_exiting directly. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/8145127/01/webrev/ I still see pr_ex ?? Oops. Forgot to 'hg qrefresh' before generating the webrev. The webrev has just been updated in place. Sincerely yours, Ivan
Re: RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
We will suspend the current thread if two conditions are satisfied: process_exiting != 0 and process_exiting != current thread id. But, yes, pr_ex isn't really needed, as we can use process_exiting directly. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/8145127/01/webrev/ I still see pr_ex ?? Oops. Forgot to 'hg qrefresh' before generating the webrev. The webrev has just been updated in place. Sincerely yours, Ivan
Re: RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
On 17.12.2015 8:45, David Holmes wrote: On 15/12/2015 10:18 PM, Ivan Gerasimov wrote: while(true) would convey that much more clearly - and perhaps obviate the need for pr_ex. Yes, I can surely transform the code - while (pr_ex != curr_id) { + if (pr_ex != curr_id) { + while (true} { The intention was to save a line :-) I'll use while (true), if it improves readability. It does :) And I don't you need pr_ex then as you can just compare OrderAccess::load_acquire(_exiting) with the current thread id - right? We will suspend the current thread if two conditions are satisfied: process_exiting != 0 and process_exiting != current thread id. But, yes, pr_ex isn't really needed, as we can use process_exiting directly. Here's the updated webrev: http://cr.openjdk.java.net/~igerasim/8145127/01/webrev/ Sincerely yours, Ivan
Re: RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
while(true) would convey that much more clearly - and perhaps obviate the need for pr_ex. Yes, I can surely transform the code - while (pr_ex != curr_id) { + if (pr_ex != curr_id) { + while (true} { The intention was to save a line :-) I'll use while (true), if it improves readability. Sincerely yours, Ivan
Re: RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
On 15.12.2015 9:31, David Holmes wrote: I really wish we had some view inside windows to see exactly what is going wrong here. :( Yes, so do I! It would be really helpful to know exactly how this race bug is "implemented". Or, at least, how to deal with it to avoid the return code replacement for sure. On 15/12/2015 12:09 AM, Ivan Gerasimov wrote: Hello! There was a timeout observed in os_windows.cpp at the line 3945 res = WaitForMultipleObjects(MAXIMUM_WAIT_OBJECTS, handles, FALSE, EXIT_TIMEOUT); This means there were more than 64 threads exiting at the same time and none of the first 64 threads could make any progress during 5 minutes. Sigh. To address this issue I suggest two things: 1) Increase the size of the queue of exiting threads. We'll still have to wait for only the first 64 threads, if the queue is exhausted. But the chances we hit this condition are greatly reduced. 2) Raise process_exiting flag earlier, i.e. before trying to enter the critical section. This should decrease the number of threads, contending for a slot in the 'handles' array during the process exit. Additionally, it is proposed to suspend all the threads, but the one which raised the flag 'process_exiting'. It would be important in a case, when two threads are about to call exit() concurrently. Otherwise, a race could be faced, if the first thread is waiting for all the registered handles, while the second one skips the critical section altogether and calls ::exit(). Build went fine on all platforms. The JTREG tests from 'hotspot' subset all pass. Would you please help review the proposed fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8145127 WEBREV: http://cr.openjdk.java.net/~igerasim/8145127/00/webrev/ Can't quite get my head round the whole strategy, but in this loop: 4033 while (pr_ex != curr_id) { pr_ex is never updated! It's intentional. It's not meant the current thread will ever leave this loop, once enters. The first line of the loop body will effectively suspend the current thread, so it will never call _endthreadex(), exit() or _exit() itself. Just for the extra protection, the SuspendThread() call is wrapped into an infinite loop. Sincerely yours, Ivan
RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...
Hello! There was a timeout observed in os_windows.cpp at the line 3945 res = WaitForMultipleObjects(MAXIMUM_WAIT_OBJECTS, handles, FALSE, EXIT_TIMEOUT); This means there were more than 64 threads exiting at the same time and none of the first 64 threads could make any progress during 5 minutes. Sigh. To address this issue I suggest two things: 1) Increase the size of the queue of exiting threads. We'll still have to wait for only the first 64 threads, if the queue is exhausted. But the chances we hit this condition are greatly reduced. 2) Raise process_exiting flag earlier, i.e. before trying to enter the critical section. This should decrease the number of threads, contending for a slot in the 'handles' array during the process exit. Additionally, it is proposed to suspend all the threads, but the one which raised the flag 'process_exiting'. It would be important in a case, when two threads are about to call exit() concurrently. Otherwise, a race could be faced, if the first thread is waiting for all the registered handles, while the second one skips the critical section altogether and calls ::exit(). Build went fine on all platforms. The JTREG tests from 'hotspot' subset all pass. Would you please help review the proposed fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8145127 WEBREV: http://cr.openjdk.java.net/~igerasim/8145127/00/webrev/ Sincerely yours, Ivan
Re: RFR (XXS) 8069068: VM warning: WaitForMultipleObjects timed out (0) ...
On 20.05.2015 10:04, David Holmes wrote: Hi Ivan, On 20/05/2015 9:54 AM, Ivan Gerasimov wrote: Hello! Since the beginning of the year a warning about an expired timeout has been seen a few times. It is most likely to observe this under a heavy load, when many threads are exiting. The propose is to increase the timeout. The rationale is that if an application needs more time for the threads to exit, let's give it that time. Rather than just bumping the timeout a little I would suggest bumping it a lot - make it a few minutes. If that fails then it suggests we have a real problem not just a slow or loaded machine. Otherwise if we just keep bumping the timeout it serves no purpose. Alright. Then let's make it 5 minutes for both product and debug bits: WEBREV: http://cr.openjdk.java.net/~igerasim/8069068/01/webrev/ Sincerely yours, Ivn Cheers, David BUGURL: https://bugs.openjdk.java.net/browse/JDK-8069068 WEBREV: http://cr.openjdk.java.net/~igerasim/8069068/00/webrev/ Sincerely yours, Ivan
Re: RFR (XXS) 8069068: VM warning: WaitForMultipleObjects timed out (0) ...
On 20.05.2015 18:19, Daniel D. Daugherty wrote: On 5/20/15 7:57 AM, Ivan Gerasimov wrote: On 20.05.2015 10:04, David Holmes wrote: Hi Ivan, On 20/05/2015 9:54 AM, Ivan Gerasimov wrote: Hello! Since the beginning of the year a warning about an expired timeout has been seen a few times. It is most likely to observe this under a heavy load, when many threads are exiting. The propose is to increase the timeout. The rationale is that if an application needs more time for the threads to exit, let's give it that time. Rather than just bumping the timeout a little I would suggest bumping it a lot - make it a few minutes. If that fails then it suggests we have a real problem not just a slow or loaded machine. Otherwise if we just keep bumping the timeout it serves no purpose. Alright. Then let's make it 5 minutes for both product and debug bits: WEBREV: http://cr.openjdk.java.net/~igerasim/8069068/01/webrev/ Thumbs up. If I remember correctly, we'll only hit the 5 minute timeout if we end up with the bottleneck where too many threads are queued up to leave... and none of the threads that are leaving make progress... Yes. That's correct. Thank you Dan. Sincerely yours, Ivan Dan Sincerely yours, Ivn Cheers, David BUGURL: https://bugs.openjdk.java.net/browse/JDK-8069068 WEBREV: http://cr.openjdk.java.net/~igerasim/8069068/00/webrev/ Sincerely yours, Ivan
RFR (XXS) 8069068: VM warning: WaitForMultipleObjects timed out (0) ...
Hello! Since the beginning of the year a warning about an expired timeout has been seen a few times. It is most likely to observe this under a heavy load, when many threads are exiting. The propose is to increase the timeout. The rationale is that if an application needs more time for the threads to exit, let's give it that time. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8069068 WEBREV: http://cr.openjdk.java.net/~igerasim/8069068/00/webrev/ Sincerely yours, Ivan
Re: RFR 8069048: (process) Suspend finishing threads when process exits [win]
Thank you Daniel for your review! On 16.01.2015 3:01, Daniel D. Daugherty wrote: On 1/15/15 5:09 AM, Ivan Gerasimov wrote: Hello everyone! This is yet another iteration in the attempt to solve the 'wrong exit code' bug on Windows [1]. The wrong exit code has been observed once again with one of the regression tests. The suspicion is that this might be due to the critical section had been destroyed before all the children threads were terminated. In that case, one of the threads had been unblocked and called _endthreadex(), which resulted in a race. To address this scenario, it is proposed to make the thread that is about to exit the process raise a flag. If the flag is raised, any threads wishing to exit should instead suspend themselves. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8069048 WEBREV: http://cr.openjdk.java.net/~igerasim/8069048/0/webrev/ src/os/windows/vm/os_windows.cpp line 3895: // don't let the current thread to proceed to _endthreadex() Typo: 'let the current thread to proceed to' - 'let the current thread proceed to' Just making sure that I understand the revised algorithm: - before the EPT_PROCESS thread gets here, EPT_THREAD threads will work as before and call line 3909 _endthreadex() - after the EPT_PROCESS thread gets here and sets the flag on line 3886: OrderAccess::release_store(process_exiting, 1); - an EPT_THREAD thread may have made it past flag check on line 3802: } else if (OrderAccess::load_acquire(process_exiting) == 0) { but it will be blocked on line 3803: EnterCriticalSection(crit_sect); - an EPT_THREAD thread that sees the flag set on line 3802 will drop into the self-suspend block on lines 3892-3900 - after the EPT_PROCESS thread exits the critical section, then any EPT_THREAD threads that were blocked trying to acquire the critical section will now see the flag set on line 3805: if (what == EPT_THREAD OrderAccess::load_acquire(process_exiting) == 0) { and drop into the self-suspend block on lines 3892-3900 Short version: any EPT_THREAD threads that arrive after the EPT_PROCESS thread owns the critical section will never call line 3909 _endthreadex() because they self-suspend. Yes, the logic is meant to be precisely as you're describing. OK, I concur that this new algorithm looks correct and will reduce the number of threads racing through line 3909 _endthreadex() while the EPT_PROCESS thread is trying to call exit(). One possible hole remains that we've discussed before. If an EPT_THREAD thread calls _endthreadex() before the EPT_PROCESS thread gets here, and if the EPT_THREAD thread stalls in _endthreadex(), then it's still possible for that EPT_THREAD thread to mess up the exit code if it unblocks after the EPT_PROCESS thread has set the exit code. The EPT_PROCESS thread is waiting for those EPT_THREAD threads that had called _endthreadex() at 3874 res = WaitForMultipleObjects(handle_count, handles, TRUE, EXIT_TIMEOUT). This can timeout , of course, so yes, the window for a race is still open. We've discussed this before and there's nothing we can do about other than try and reduce the probability by reducing the number of EPT_THREAD threads that are calling _endthreadex(). Thumbs up! Side note: A new failure of this mechanism was filed recently: JDK-8069068 VM warning: WaitForMultipleObjects timed out (0) ... https://bugs.openjdk.java.net/browse/JDK-8069068 The bug was filed against JDK9-B45 so it has the most recent fix (https://bugs.openjdk.java.net/browse/JDK-8066863). The weird part is that WaitForMultipleObjects() timed out without an error code being set. Don't know if that means anything in particular in the Win* APIS... This fix (8069048) may also reduce the probability of this failure mode because we'll be queueing fewer threads on the handle list... Right. If this failure had happened during the app termination, the new logic might have helped. It still looks surprising though, as the warning indicates that none of 64 threads that have already called _endthreadex() could not complete execution during 4 second! And one of those threads had the priority above normal. Very strange. I need to try to reproduce this kind of failure locally. Sincerely yours, Ivan
RFR 8069048: (process) Suspend finishing threads when process exits [win]
Hello everyone! This is yet another iteration in the attempt to solve the 'wrong exit code' bug on Windows [1]. The wrong exit code has been observed once again with one of the regression tests. The suspicion is that this might be due to the critical section had been destroyed before all the children threads were terminated. In that case, one of the threads had been unblocked and called _endthreadex(), which resulted in a race. To address this scenario, it is proposed to make the thread that is about to exit the process raise a flag. If the flag is raised, any threads wishing to exit should instead suspend themselves. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8069048 WEBREV: http://cr.openjdk.java.net/~igerasim/8069048/0/webrev/ [1] https://bugs.openjdk.java.net/browse/JDK-6573254 Sincerely yours, Ivan
Re: Change 8057744: (process) Synchronize exiting of threads and process [win] breaks HotSpot on Win Server 2003
Hi Volker! Yes, I do hope to port this fix to jdk8u, once it is proved to fix the bug. However, as you correctly noticed, it will require some reorganization of the code: the initialization of the critical section will be done with initialization of other global variables. I preferred to do it with InitOnceExecuteOnce, since it helps keep all this workaround code in one place. JDK-8057744 is subtask of another confidential bug (due to internal links), and Jira does not allow to set different visibility level for subtask. Sincerely yours, Ivan On 23.12.2014 17:31, Volker Simonis wrote: Hi Ivan, I have just realized that the change 8057744 breaks HotSpot on Windows Server 2003 because the InitOnceExecuteOnce function isn't supported there. I suppose you're aware of this as the change contains the following comment in os_windows.cpp: // Must be at least Windows Vista or Server 2008 to use InitOnceExecuteOnce #define _WIN32_WINNT 0x0600 I just wanted to ask if there are any plans to downport this change to jdk8 because we currently still still support jdk8 HotSpot on Server 2003. Unfortunately the bug 8057744 is closed so I can not subscribe for notifications. If there are no special reasons for keeping it closed, can you please make it visible to everybody. Thank you and best regards, Volker On Sun, Sep 7, 2014 at 8:07 AM, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hello! This is a proposal to address issue with wrong exit codes from a Java processes on Windows. In order to avoid a race, calls to _endthread(), exit and _exit() are explicitly synchronized. We allow simultaneous calls to _endthread() by multiple threads. However, at the time exit() or _exit() is called, no calls to _endthread() are allowed. Some instrumentation added with JDK-8055338 remain in the code to help diagnose the failures if they still occur. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8057744 WEBREV: http://cr.openjdk.java.net/~igerasim/8057744/0/webrev/ Sincerely yours, Ivan
Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects
On 11.12.2014 13:01, David Holmes wrote: On 11/12/2014 7:48 PM, David Holmes wrote: Hi Ivan, On 11/12/2014 4:52 PM, Ivan Gerasimov wrote: Hello! After the fix for JDK-8064694 some more failures of WaitForMultipleObjects() were observed under heavy load. The reason was that the limitation on wait object number was overlooked. The total number of the objects should not be greater than MAXIMUM_WAIT_OBJECTS (= 64). The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use MAXIMUM_WAIT_OBJECTS instead for all kinds of builds. I also added the last error code to the failure reports, so it would be easier to identify the cause of a failure. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ The webrev changes do not correspond to the description you gave above. Correct webrev URL: http://cr.openjdk.java.net/~igerasim/8066863/0/webrev/ Thank you David for correcting the link and the review! Seems this saga will never end. :( Changes seem okay. I still have a hope to have it finished one day. Sincerely yours, Ivan Thanks, David David Sincerely yours, Ivan
Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects
On 11.12.2014 19:05, Daniel D. Daugherty wrote: On 12/11/14 3:01 AM, David Holmes wrote: On 11/12/2014 7:48 PM, David Holmes wrote: Hi Ivan, On 11/12/2014 4:52 PM, Ivan Gerasimov wrote: Hello! After the fix for JDK-8064694 some more failures of WaitForMultipleObjects() were observed under heavy load. The reason was that the limitation on wait object number was overlooked. The total number of the objects should not be greater than MAXIMUM_WAIT_OBJECTS (= 64). The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use MAXIMUM_WAIT_OBJECTS instead for all kinds of builds. I also added the last error code to the failure reports, so it would be easier to identify the cause of a failure. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ The webrev changes do not correspond to the description you gave above. Correct webrev URL: http://cr.openjdk.java.net/~igerasim/8066863/0/webrev/ src/os/windows/vm/os_windows.cpp Thanks for adding the GetLastError() info to the messages. Thumbs up. RT_Baseline has already pushed to Main_Baseline for this week so you should be good to go if you're happy with your pre-push testing. Thank you Daniel for the review! I've run a JPRT job + hotspot test set, with so single failure. Sincerely yours, Ivan Seems this saga will never end. :( Changes seem okay. On the plus side, we're seeing fewer and fewer exit_code stomping failures in nightly so things are getting better... Dan Thanks, David David Sincerely yours, Ivan
RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects
Hello! After the fix for JDK-8064694 some more failures of WaitForMultipleObjects() were observed under heavy load. The reason was that the limitation on wait object number was overlooked. The total number of the objects should not be greater than MAXIMUM_WAIT_OBJECTS (= 64). The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use MAXIMUM_WAIT_OBJECTS instead for all kinds of builds. I also added the last error code to the failure reports, so it would be easier to identify the cause of a failure. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ Sincerely yours, Ivan
Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Thank you Daniel! David, are you still Okay with the updated webrev? Comparing to the previous one, I've added setting the priority of the current thread at the line 3880 and changed the priority level to from HIGHEST to ABOVE_NORMAL. Sincerely yours, Ivan On 18.11.2014 18:27, Daniel D. Daugherty wrote: http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/ src/os/windows/vm/os_windows.cpp No commments. Thumbs up. Dan On 11/18/14 12:29 AM, Ivan Gerasimov wrote: Hi Markus! The priority of the exiting thread will be raised for quite a short period of time -- right before the thread finishes exiting. There are two places where the priority is adjusted. Under normal conditions we should never see the first place hit. However, if we do, this means we have a huge number of threads. Raising the priority of one of them is a hint about which thread we want the scheduler to focus on. The second place is a bit different. We have several threads running immediately before ending the process. Some of them are at the exiting path and block exiting of the whole process. Raising the priority of those threads is a way to say we're not interested in all the other threads, as they are going to be terminated anyway. I just noticed that in second scenario it may be appropriate to set the priority of the current thread to the same level as for the exiting threads. This way it'll be given a fair chance to continue if the timeout expires. I also think it should be enough to set the priority level to THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST. It will give just +1 to the priority value -- should be enough for the hint. Would you please take a look at the updated webrev: http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/ Sincerely yours, Ivan On 17.11.2014 11:33, Markus Grönlund wrote: I agree with David. The side effects will be unknown and very hard to debug. Is there another way to accomplish the results without manipulating base services? Thanks Markus -Original Message- From: David Holmes Sent: den 17 november 2014 07:40 To: Ivan Gerasimov; Daniel Daugherty Cc: serviceability-dev@openjdk.java.net; hotspot-runtime-dev Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844 On 17/11/2014 7:23 AM, Ivan Gerasimov wrote: Thank you Daniel! Please find the updated webrev with your suggestions incorporated here: http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/ Concerning the thread priority: If the application is of NORMAL_PRIORITY_CLASS, then setting the thread's priority level to THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10 (of maximum 31). http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs. 85).aspx And if the process is HIGH_PRIORITY_CLASS, then the tread with the HIGHEST priority level will have priority value == 15 of 31. I believe, it should not be too much, and the machine will not become busy with only those closing threads. However, I hope it would be enough to make them complete faster than other threads of the NORMAL priority level withing the same application. I don't think this is necessary or desirable. Under normal usage we're giving priority to exiting threads and that may disrupt the usual scheduling patterns that applications see. You may posit that it is harmless but we can't say that for sure. Nor can we actually know that this will help with this particular bug. I would not add in this new code. David Sincerely yours, Ivan On 15.11.2014 2:22, Daniel D. Daugherty wrote: On 11/14/14 5:35 AM, Ivan Gerasimov wrote: Hello! The recent fix for JDK-8059533 ((process) Make exiting process wait for exiting threads [win]) caused the warning message to be printed in some test environments: --- os_windows.cpp:3844 is in the newly updated os::win32::exit_process_or_thread(Ept what, int exit_code) --- This has been observed with debug builds on highly loaded systems. To address the issue it is proposed to do three things: 1) increase the timeout for debug builds, 2) increase the maximum number of the thread handles to be stored, 3) rise the priority of the exiting threads, if we need to wait for them. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ src/os/windows/vm/os_windows.cpp line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128) Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? That uses the smaller value for only one build config (PRODUCT). line 3785: #define EXIT_TIMEOUT NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/ Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? Please add spaces between
Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Thanks David! On 17.11.2014 9:40, David Holmes wrote: On 17/11/2014 7:23 AM, Ivan Gerasimov wrote: Thank you Daniel! Please find the updated webrev with your suggestions incorporated here: http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/ Concerning the thread priority: If the application is of NORMAL_PRIORITY_CLASS, then setting the thread's priority level to THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10 (of maximum 31). http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx And if the process is HIGH_PRIORITY_CLASS, then the tread with the HIGHEST priority level will have priority value == 15 of 31. I believe, it should not be too much, and the machine will not become busy with only those closing threads. However, I hope it would be enough to make them complete faster than other threads of the NORMAL priority level withing the same application. I don't think this is necessary or desirable. Under normal usage we're giving priority to exiting threads and that may disrupt the usual scheduling patterns that applications see. You may posit that it is harmless but we can't say that for sure. Nor can we actually know that this will help with this particular bug. I would not add in this new code. There are two places where I put adjusting the thread's priority: 1) We've the array of handles filled up. If we're found in this code branch, it'll mean that unfortunately we've already got broken exit pattern, because the current thread has to do a blocking call, having the ownership of a critical section. The full array of handles means that many threads are exiting at that time, thus all the threads that are starting to exit after the current one will block at the attempt to grab ownership of the critical section. Raising the priority of one thread that had already reached _endthreadex(), seems appropriate to me in such a situation, because it helps shorten the period of time when the threads remain blocked. Choosing the oldest exiting thread ensures that the period of time when the priority of one thread is higher is the smallest possible. 2) The process exit branch. That's the main part of the fix -- here we make the process to wait for all the threads having called _endthreadex() to complete, at the same time preventing any other threads from starting the exiting procedure. The execution flow is already changed here (I don't want to say disrupted, because it was meant to fix the issue). All running threads are about to be terminated soon by ending the process, so raising the priority of some of the threads should not have any bad impact on the program flow. Instead, it may make the time the process has to wait before calling exit() shorter. I can surely remove that playing with the threads' priority, as it's not the essential part of the fix. However, I think it's a useful hint to the scheduler, which can improve things in some situations, and I'm not really sure how it can harm. Sincerely yours, Ivan David Sincerely yours, Ivan On 15.11.2014 2:22, Daniel D. Daugherty wrote: On 11/14/14 5:35 AM, Ivan Gerasimov wrote: Hello! The recent fix for JDK-8059533 ((process) Make exiting process wait for exiting threads [win]) caused the warning message to be printed in some test environments: --- os_windows.cpp:3844 is in the newly updated os::win32::exit_process_or_thread(Ept what, int exit_code) --- This has been observed with debug builds on highly loaded systems. To address the issue it is proposed to do three things: 1) increase the timeout for debug builds, 2) increase the maximum number of the thread handles to be stored, 3) rise the priority of the exiting threads, if we need to wait for them. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ src/os/windows/vm/os_windows.cpp line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128) Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? That uses the smaller value for only one build config (PRODUCT). line 3785: #define EXIT_TIMEOUT NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/ Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? Please add spaces between the comment delimiters and the comment text. That uses the smaller timeout for only one build config (PRODUCT). line 3836 // Rise the priority... Typo: 'Rise' - 'Raise' About the general idea of raising the exiting thread's priority, if the exiting thread is looping in some Win* OS code after this point, will raising the priority make the machine unusable? Dan The fix was tested on all available platforms, with the hotspot testset. No failures. Sincerely yours, Ivan
Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Hi Markus! The priority of the exiting thread will be raised for quite a short period of time -- right before the thread finishes exiting. There are two places where the priority is adjusted. Under normal conditions we should never see the first place hit. However, if we do, this means we have a huge number of threads. Raising the priority of one of them is a hint about which thread we want the scheduler to focus on. The second place is a bit different. We have several threads running immediately before ending the process. Some of them are at the exiting path and block exiting of the whole process. Raising the priority of those threads is a way to say we're not interested in all the other threads, as they are going to be terminated anyway. I just noticed that in second scenario it may be appropriate to set the priority of the current thread to the same level as for the exiting threads. This way it'll be given a fair chance to continue if the timeout expires. I also think it should be enough to set the priority level to THREAD_PRIORITY_ABOVE_NORMAL instead of THREAD_PRIORITY_HIGHEST. It will give just +1 to the priority value -- should be enough for the hint. Would you please take a look at the updated webrev: http://cr.openjdk.java.net/~igerasim/8064694/2/webrev/ Sincerely yours, Ivan On 17.11.2014 11:33, Markus Grönlund wrote: I agree with David. The side effects will be unknown and very hard to debug. Is there another way to accomplish the results without manipulating base services? Thanks Markus -Original Message- From: David Holmes Sent: den 17 november 2014 07:40 To: Ivan Gerasimov; Daniel Daugherty Cc: serviceability-dev@openjdk.java.net; hotspot-runtime-dev Subject: Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844 On 17/11/2014 7:23 AM, Ivan Gerasimov wrote: Thank you Daniel! Please find the updated webrev with your suggestions incorporated here: http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/ Concerning the thread priority: If the application is of NORMAL_PRIORITY_CLASS, then setting the thread's priority level to THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10 (of maximum 31). http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs. 85).aspx And if the process is HIGH_PRIORITY_CLASS, then the tread with the HIGHEST priority level will have priority value == 15 of 31. I believe, it should not be too much, and the machine will not become busy with only those closing threads. However, I hope it would be enough to make them complete faster than other threads of the NORMAL priority level withing the same application. I don't think this is necessary or desirable. Under normal usage we're giving priority to exiting threads and that may disrupt the usual scheduling patterns that applications see. You may posit that it is harmless but we can't say that for sure. Nor can we actually know that this will help with this particular bug. I would not add in this new code. David Sincerely yours, Ivan On 15.11.2014 2:22, Daniel D. Daugherty wrote: On 11/14/14 5:35 AM, Ivan Gerasimov wrote: Hello! The recent fix for JDK-8059533 ((process) Make exiting process wait for exiting threads [win]) caused the warning message to be printed in some test environments: --- os_windows.cpp:3844 is in the newly updated os::win32::exit_process_or_thread(Ept what, int exit_code) --- This has been observed with debug builds on highly loaded systems. To address the issue it is proposed to do three things: 1) increase the timeout for debug builds, 2) increase the maximum number of the thread handles to be stored, 3) rise the priority of the exiting threads, if we need to wait for them. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ src/os/windows/vm/os_windows.cpp line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128) Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? That uses the smaller value for only one build config (PRODUCT). line 3785: #define EXIT_TIMEOUT NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/ Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? Please add spaces between the comment delimiters and the comment text. That uses the smaller timeout for only one build config (PRODUCT). line 3836 // Rise the priority... Typo: 'Rise' - 'Raise' About the general idea of raising the exiting thread's priority, if the exiting thread is looping in some Win* OS code after this point, will raising the priority make the machine unusable? Dan The fix was tested on all available platforms, with the hotspot testset. No failures. Sincerely
Re: RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Thank you Daniel! Please find the updated webrev with your suggestions incorporated here: http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/ Concerning the thread priority: If the application is of NORMAL_PRIORITY_CLASS, then setting the thread's priority level to THREAD_PRIORITY_HIGHEST will result in its priority value to be only 10 (of maximum 31). http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx And if the process is HIGH_PRIORITY_CLASS, then the tread with the HIGHEST priority level will have priority value == 15 of 31. I believe, it should not be too much, and the machine will not become busy with only those closing threads. However, I hope it would be enough to make them complete faster than other threads of the NORMAL priority level withing the same application. Sincerely yours, Ivan On 15.11.2014 2:22, Daniel D. Daugherty wrote: On 11/14/14 5:35 AM, Ivan Gerasimov wrote: Hello! The recent fix for JDK-8059533 ((process) Make exiting process wait for exiting threads [win]) caused the warning message to be printed in some test environments: --- os_windows.cpp:3844 is in the newly updated os::win32::exit_process_or_thread(Ept what, int exit_code) --- This has been observed with debug builds on highly loaded systems. To address the issue it is proposed to do three things: 1) increase the timeout for debug builds, 2) increase the maximum number of the thread handles to be stored, 3) rise the priority of the exiting threads, if we need to wait for them. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ src/os/windows/vm/os_windows.cpp line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128) Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? That uses the smaller value for only one build config (PRODUCT). line 3785: #define EXIT_TIMEOUT NOT_DEBUG(1000) DEBUG_ONLY(4000) /*1 sec in product, 4 sec in debug*/ Instead of NOT_DEBUG can you use PRODUCT_ONLY? Instead of DEBUG_ONLY can you used NOT_PRODUCT? Please add spaces between the comment delimiters and the comment text. That uses the smaller timeout for only one build config (PRODUCT). line 3836 // Rise the priority... Typo: 'Rise' - 'Raise' About the general idea of raising the exiting thread's priority, if the exiting thread is looping in some Win* OS code after this point, will raising the priority make the machine unusable? Dan The fix was tested on all available platforms, with the hotspot testset. No failures. Sincerely yours, Ivan
RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Hello! The recent fix for JDK-8059533 ((process) Make exiting process wait for exiting threads [win]) caused the warning message to be printed in some test environments: --- os_windows.cpp:3844 is in the newly updated os::win32::exit_process_or_thread(Ept what, int exit_code) --- This has been observed with debug builds on highly loaded systems. To address the issue it is proposed to do three things: 1) increase the timeout for debug builds, 2) increase the maximum number of the thread handles to be stored, 3) rise the priority of the exiting threads, if we need to wait for them. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694 WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/ The fix was tested on all available platforms, with the hotspot testset. No failures. Sincerely yours, Ivan
RFR 8062647: Wrong indentation of arguments of annotated methods
Hello everybody! I noticed that the javadoc tool may produce the doc with misaligned arguments of annotated methods in the 'method details' section. For example: http://jre.us.oracle.com/java/re/jdk/9/nightly/latest/docs/api/java/util/EnumSet.html#of-E-E...- Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8062647 WEBREV: http://cr.openjdk.java.net/~igerasim/8062647/0/webrev/ Sincerely yours, Ivan
Re: RFR 8059533: (process) Make exiting process wait for exiting threads [win]
On 29.10.2014 13:15, David Holmes wrote: Great improvement thanks! One further suggestion - after line 3731 add a short overview eg: // Basic approach: // - Each exiting thread registers its intent to exit and then does so. // - A thread trying to terminate the process must wait for all //threads currently exiting to complete their exit Typo: 3802 // _endthreadex() comleted. Thanks! Fixed the typo and added the comment. The updated webrev is at the same location: http://cr.openjdk.java.net/~igerasim/8059533/1/webrev/ I have a couple of thought about this: First, the thread that ends the process waits for all the exiting threads to complete. By the time it returns from WaitForMultipleObjects, those exiting threads aren't running anymore, so the race is avoided (unless it's the race with scheduler). That's not quite true and the key part of my point. At some point during the thread termination process the exiting thread has to signal any waiting thread - and it is still running at the point. We don't know whether the action that causes interference with the process termination happens before or after the signalling is done. I was thinking if the dedicated scheduler thread can do the signalling. Otherwise, some poorly behaving thread could die without updating its status, leaving other threads waiting for its completion forever. Though, I don't know how it's actually done. I've read in several places (for example, a comment at the bottom of [1]), that waiting for a thread to complete with WaitForXXX() function ensures the exit code of that thread is set. In particular, GetExitCodeThread() should not return STILL_ACTIVE for that thread anymore. This lets me hope that the code, which sets the exit code (which is potentially racy) has already been executed by the time WaitForXXX() returns. [1] http://msdn.microsoft.com/en-us/library/windows/desktop/ms683190(v=vs.85).aspx Anyway this narrows the window even further even if it may not close it completely. That's my hope too :) Otherwise, I'll be (almost) out of ideas how to work this issue around. Sincerely yours, Ivan
Re: RFR 8059533: (process) Make exiting process wait for exiting threads [win]
On 27.10.2014 3:36, David Holmes wrote: On 27/10/2014 1:15 AM, Ivan Gerasimov wrote: David, would you approve this fix? Sorry Ivan I'm having trouble following the logic this time - could you add some comments about what we are checking at each step. Yes, sure. The main idea is to make the thread that ends the process wait for the threads that had finished so far. Thus, we have an array for storing the thread handles. Any thread that is on thread-exit path, first tries to remove the completed threads from the array (to keep the list smaller), and then adds its own handle to the end of the array. The thread that is on process-exit path, calls exit (or _exit), while still owning the critical section. This way we make sure, no other threads execute any exit-related code at the same time. Here's a typical scenario: 1) First thread that decided to end itself calls exit_process_or_thread() -- let's assume it is on thread-exit path. Initializes the critical section. 2) Grabs the ownership of the crit. section 3) The list of thread handles is initially empty, so the thread adds a duplicate of its handle to the array. 4) Releases the crit. section 5) Calls _endthreadex() to terminate itself 6) Another thread enters exit_process_or_thread() -- let it be on thread-exit path as well. 7) Grabs the ownership of the crit. section 8) In a loop checks if any previously ended thread has completed. Here we call WaitForSingleObject with zero timeout, so we don't block. All the handles of completed threads are closed. 9) If there's is a free slot in the array, the thread adds its handle to the end 10) If the array is full (which is very unlikely), the thread waits for ANY thread to complete, and then adds itself to the array. 11) Releases the crit. section 12) Calls _endthreadex() to terminate itself 13) Some thread enters exit_process_or_thread() in order to end the whole process. 14) Grabs the ownership of the crit. section 15) Waits on all the threads that have added their handles to the array (typically there will be only one such thread handle). Since the ownership of the critical section is held, no other threads will execute any exit-related code at this time. 16) Once all the threads from the list have completed, the thread closes the handles and calls exit() (or _exit()), holding the crit. section ownership. We're done. Error handling: in a case of errors, we report them, and proceed with exiting as usual. - If initialization of critical section fails, we'll just call the corresponding exit routine. - If we failed, waiting for an exiting thread to complete, close its handle as if it has completed. - If we failed, waiting for any thread to complete withing a time-out (array is full), close all the handles and continue as if there were no threads exited before. - If we couldn't duplicate the handle, ignore it (don't add it to the array), so no one will wait for it later. - If the thread on the process-exit path failed to wait for the threads to complete withing the time-out, proceed to the exit anyway. All these errors should never happen during normal execution, but if they do, we still try to end threads/process in a way it's done now. In this, later case, we are at risk of observing a race condition. However, the chances of this happening are much lesser, and in addition we'll have a waring message to analyze. Possible bottlenecks. 1) All the threads have to obtain the ownership of the critical section, which effectively serializes all the exiting threads. However, this doesn't appear to make things too much slower, as all the threads already do similar thing in _endthreadex(). 2) Normally, the threads don't block having ownership of the crit. section. The block can only happen if there's no free slot in the array of handles. This can only happen if MAX_EXIT_HANDLES (== 16) threads have just called _endthreadex(), and none of them completed. 3) When the thread at process-exit path waits for all the exiting threads to complete, the time-out of 1 second is specified. If any of those threads do not complete, this can lead to that the application is delayed at the exit. However, we don't block forever, and the delay can only be observed upon a failure. Also we seem to exit while still holding the critical section - how does that work? Right. We make the thread at the process-exit path call exit() from withing critical section block. This way it is ensured no other exit-related code is executed at the same moment, and a race is avoided. Sincerely yours, Ivan Thanks, David Sincerely yours, Ivan On 26.10.2014 19:01, Daniel D. Daugherty wrote: On 10/25/14 12:23 PM, Ivan Gerasimov wrote: On 25.10.2014 3:06, Daniel D. Daugherty wrote: On 10/1/14 3:07 AM, Ivan Gerasimov wrote: Hello! The tests that continue to fail with wrong exit codes suggest that the fix for JDK-8057744 wasn't sufficient. Here's another proposal, which expands the synchronized
Re: RFR 8059533: (process) Make exiting process wait for exiting threads [win]
David, would you approve this fix? Sincerely yours, Ivan On 26.10.2014 19:01, Daniel D. Daugherty wrote: On 10/25/14 12:23 PM, Ivan Gerasimov wrote: On 25.10.2014 3:06, Daniel D. Daugherty wrote: On 10/1/14 3:07 AM, Ivan Gerasimov wrote: Hello! The tests that continue to fail with wrong exit codes suggest that the fix for JDK-8057744 wasn't sufficient. Here's another proposal, which expands the synchronized portion of the code. It is proposed to make the exiting process wait for the threads that have already started exiting. This should help to make sure that no thread is executing any potentially racy code concurrently with the exiting process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ Finally got a chance to look at the official version of fix. Thumbs up! src/os/windows/vm/os_windows.cpp No comments. Thank you Daniel! I assume the change needs the second hotspot reviewer? Yes, HotSpot changes always need two reviewers. David Holmes chimed in on this thread. You should ask him if he can be counted as a reviewer. What would be the best time for pushing this fix? Let's go for Wednesday again so we have a full week of testing to evaluate this latest tweak. Dan Sincerely yours, Ivan Dan P.S. We had another sighting of an exit_code == 60115 test failure this past week so while your previous fix greatly reduced the odds of this race, I'm looking forward to seeing this new version in action... Comments, suggestion are welcome! Sincerely yours, Ivan
Re: RFR 8059533: (process) Make exiting process wait for exiting threads [win]
On 25.10.2014 3:06, Daniel D. Daugherty wrote: On 10/1/14 3:07 AM, Ivan Gerasimov wrote: Hello! The tests that continue to fail with wrong exit codes suggest that the fix for JDK-8057744 wasn't sufficient. Here's another proposal, which expands the synchronized portion of the code. It is proposed to make the exiting process wait for the threads that have already started exiting. This should help to make sure that no thread is executing any potentially racy code concurrently with the exiting process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ Finally got a chance to look at the official version of fix. Thumbs up! src/os/windows/vm/os_windows.cpp No comments. Thank you Daniel! I assume the change needs the second hotspot reviewer? What would be the best time for pushing this fix? Sincerely yours, Ivan Dan P.S. We had another sighting of an exit_code == 60115 test failure this past week so while your previous fix greatly reduced the odds of this race, I'm looking forward to seeing this new version in action... Comments, suggestion are welcome! Sincerely yours, Ivan
RFR 8059533: (process) Make exiting process wait for exiting threads [win]
Hello! The tests that continue to fail with wrong exit codes suggest that the fix for JDK-8057744 wasn't sufficient. Here's another proposal, which expands the synchronized portion of the code. It is proposed to make the exiting process wait for the threads that have already started exiting. This should help to make sure that no thread is executing any potentially racy code concurrently with the exiting process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533 WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/ Comments, suggestion are welcome! Sincerely yours, Ivan
Re: RFR [8057744] (process) Synchronize exiting of threads and process [win]
On 08.09.2014 17:31, Daniel D. Daugherty wrote: Ivan, I'll sponsor the change, but I want to push it after this week's RT_Baseline snapshot. We snapshot on Tuesday (2014.09.09 at 1900 PT) and if the nightly test results look reasonable Wednesday morning, then I'll push it on Wednesday. Does this sound acceptable to you? Sure! Thanks a lot! I was told that as a jdk9 committer I do have a permission to push into hotspot repository. Though, I'm not yet familiar with pushing trough JPRT, so your help will be greatly appreciated! Sincerely yours, Ivan
RFR [8057744] (process) Synchronize exiting of threads and process [win]
Hello! This is a proposal to address issue with wrong exit codes from a Java processes on Windows. In order to avoid a race, calls to _endthread(), exit and _exit() are explicitly synchronized. We allow simultaneous calls to _endthread() by multiple threads. However, at the time exit() or _exit() is called, no calls to _endthread() are allowed. Some instrumentation added with JDK-8055338 remain in the code to help diagnose the failures if they still occur. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8057744 WEBREV: http://cr.openjdk.java.net/~igerasim/8057744/0/webrev/ Sincerely yours, Ivan
Re: RFR [8057744] (process) Synchronize exiting of threads and process [win]
Thanks Daniel and David! I'll need a sponsor to help me push the change. Sincerely yours, Ivan On 08.09.2014 6:03, David Holmes wrote: Hi Ivan, On 7/09/2014 4:07 PM, Ivan Gerasimov wrote: Hello! This is a proposal to address issue with wrong exit codes from a Java processes on Windows. In order to avoid a race, calls to _endthread(), exit and _exit() are explicitly synchronized. We allow simultaneous calls to _endthread() by multiple threads. However, at the time exit() or _exit() is called, no calls to _endthread() are allowed. Some instrumentation added with JDK-8055338 remain in the code to help diagnose the failures if they still occur. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8057744 WEBREV: http://cr.openjdk.java.net/~igerasim/8057744/0/webrev/ As per preliminary discussions this all looks okay. One small concern I have is that the exit() may be arbitrarily delayed if the mutexes are not assigned fairly, and the application (more likely a test) is creating shortlived threads concurrently with the exit attempt. But I guess we will just have to deal with that if it arises. Thanks, David Sincerely yours, Ivan
Re: RFR [8055338]: (process) Add instrumentation to help diagnose JDK-6573254
Thanks for review! Could someone please sponsor it for me? I'm not sure what the correct repository is. Sincerely yours, Ivan On 21.08.2014 11:46, serguei.spit...@oracle.com wrote: +1 On 8/20/14 10:57 PM, Staffan Larsen wrote: Looks good to me. Let’s see what it uncovers. /Staffan On 20 aug 2014, at 21:36, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hello everyone! Here's the third version of the webrev: http://cr.openjdk.java.net/~igerasim/8055338/2/webrev/ The control build of the previous one was causing a lot of test failures. This one seems to be innocent enough: no new test failures so far. Additionally, this version keeps the timing around the thread exit close to original, which might be important if we deal with a race. Sincerely yours, Ivan
Re: RFR [8055338]: (process) Add instrumentation to help diagnose JDK-6573254
Hello everyone! Here's the third version of the webrev: http://cr.openjdk.java.net/~igerasim/8055338/2/webrev/ The control build of the previous one was causing a lot of test failures. This one seems to be innocent enough: no new test failures so far. Additionally, this version keeps the timing around the thread exit close to original, which might be important if we deal with a race. Sincerely yours, Ivan
Re: RFR [8055338]: (process) Add instrumentation to help diagnose JDK-6573254
Hello again! I updated the patch to cover situations when the exiting thread isn't daemon. I also added load_acquire/store_release for the sake of accuracy, even though on Windows they seem to add nothing to the volatile access. http://cr.openjdk.java.net/~igerasim/8055338/1/webrev/ If the updated patch looks Okay, I'll need a sponsor to push it. Sincerely yours, Ivan On 19.08.2014 6:42, David Holmes wrote: On 19/08/2014 10:12 AM, Ioi Lam wrote: With the Windows/x86/x64 memory model, is the write to vm_getting_terminated guaranteed to be observable by java_start()? In the general case, not immediately. For the threads actually of interest the logic that tells the threads to terminate happens after the write to the flag, and that logic contains sufficient synchronization that if the termination logic is correct then the flag must also be visible. David - Ioi On 8/18/14, 2:19 PM, Ivan Gerasimov wrote: Hello! This is a request to temporarily add some instrumentation code to hotspot to help diagnose the intermittent failure on Windows, which results in a wrong exit code of (sub-)process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8055338 WEBREV: http://cr.openjdk.java.net/~igerasim/8055338/0/webrev/ Sincerely yours, Ivan
Re: RFR [8055338]: (process) Add instrumentation to help diagnose JDK-6573254
This is a sub-task and it inherits its security level from the parent bug. It appears that Jira doesn't allow editing the security level of sub-tasks. Sincerely yours, Ivan On 19.08.2014 17:16, Mario Torre wrote: Is there any reason why this link is only accessible if I log-in? Mario 2014-08-18 23:19 GMT+02:00 Ivan Gerasimov ivan.gerasi...@oracle.com: Hello! This is a request to temporarily add some instrumentation code to hotspot to help diagnose the intermittent failure on Windows, which results in a wrong exit code of (sub-)process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8055338 WEBREV: http://cr.openjdk.java.net/~igerasim/8055338/0/webrev/ Sincerely yours, Ivan
Re: RFR [8055338]: (process) Add instrumentation to help diagnose JDK-6573254
Thanks Kelly and David! On 19.08.2014 6:14, David Holmes wrote: On 19/08/2014 9:04 AM, Kelly O'Hair wrote: 454 if (vm_getting_terminated thread-is_Java_thread()) { 455 JavaThread* java_thread = (JavaThread*)thread; 456 if (java_thread java_lang_Thread::is_daemon(java_thread-threadObj())) { 457 return 70115; 458 } Seems like the check for java_thread being null on line 456 can never happen, because it was assigned thread which was dereferenced on line 454. Maybe on line 454 you should check to make sure thread is not null and skip the check for java_thread being null? Neither thread nor java_thread can be NULL. Yes, I'll remove this unnecessary check. Sincerely yours, Ivan Just comments from the peanut gallery. ;) Ditto :) David H. -kto On Aug 18, 2014, at 3:14 PM, Daniel D. Daugherty wrote: On 8/18/14 3:19 PM, Ivan Gerasimov wrote: Hello! This is a request to temporarily add some instrumentation code to hotspot to help diagnose the intermittent failure on Windows, which results in a wrong exit code of (sub-)process. BUGURL: https://bugs.openjdk.java.net/browse/JDK-8055338 WEBREV: http://cr.openjdk.java.net/~igerasim/8055338/0/webrev/ src/os/windows/vm/os_windows.cpp No comments. src/share/vm/runtime/java.cpp No comments. Good luck with the hunt! Many of us have looked for this bug for years off and on... Dan Sincerely yours, Ivan
Re: JDK-8041679 Replace uses of StringBuffer with StringBuilder within the JDK
Wouldn't it be a little bit more efficient to replace a string concatenation with yet another StringBuilder operation? src/share/classes/com/sun/java/util/jar/pack/BandStructure.java 631 StringBuilder sb = new StringBuilder(); ... 636 Utils.log.fine( meta-coding +sb); = 631 StringBuilder sb = new StringBuilder( meta-coding ); ... 636 Utils.log.fine(sb); and 759 StringBuilder sb = new StringBuilder(); ... 764 ps.print( //header: +sb); = 759 StringBuilder sb = new StringBuilder( //header: ); ... 764 ps.print(sb); Sincerely yours, Ivan On 12.05.2014 14:03, Paul Sandoz wrote: Hi, This is a request for review of Otavio's patch replacing StringBuffer with StringBuilder within OpenJDK. (I also need to review it.) It covers many areas and i have grouped the patches into such areas to aid reviewing. When commenting please including core-libs. Jtreg tests showed no regressions, but when reviewing we need to be mindful of the context e.g. if the buffer escapes and can cross thread boundaries. This is also an experiment to see if we can review the whole thing under one bug :-) if things prove problematic and slow we can split it out. Many files are touched but there are not many changes to each file and changes are very formulaic. I have also included ASM related changes, but i suspect we may have to leave those alone since such changes will make it more difficult to pull in ASM from upstream. - Otavio, for the record can you reply to this thread posting your single (uber) patch as textual attachment? (IIUC such attachments should now be supported by the email server). Paul. - core http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-core/webrev/ - io http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-io/webrev/ - management http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-management/webrev/ - rmi http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-rmi/webrev/ - security http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-security/webrev/ - tools http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-tools/webrev/ - graphics/media http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-media/webrev/ - asm http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-asm/webrev/
Re: JDK-8041679 Replace uses of StringBuffer with StringBuilder within the JDK
src/share/classes/sun/misc/UUDecoder.java 126 StringBuilder x = new StringBuilder(); Is only filled, but doesn't seem to be used anyhow. Maybe just delete it? Sincerely yours, Ivan On 12.05.2014 14:03, Paul Sandoz wrote: Hi, This is a request for review of Otavio's patch replacing StringBuffer with StringBuilder within OpenJDK. (I also need to review it.) It covers many areas and i have grouped the patches into such areas to aid reviewing. When commenting please including core-libs. Jtreg tests showed no regressions, but when reviewing we need to be mindful of the context e.g. if the buffer escapes and can cross thread boundaries. This is also an experiment to see if we can review the whole thing under one bug :-) if things prove problematic and slow we can split it out. Many files are touched but there are not many changes to each file and changes are very formulaic. I have also included ASM related changes, but i suspect we may have to leave those alone since such changes will make it more difficult to pull in ASM from upstream. - Otavio, for the record can you reply to this thread posting your single (uber) patch as textual attachment? (IIUC such attachments should now be supported by the email server). Paul. - core http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-core/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-core/webrev/ - io http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-io/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-io/webrev/ - management http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-management/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-management/webrev/ - rmi http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-rmi/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-rmi/webrev/ - security http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-security/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-security/webrev/ - tools http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-tools/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-tools/webrev/ - graphics/media http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-media/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-media/webrev/ - asm http://cr.openjdk.java.net/~psandoz/jdk9/sb/JDK-8041679-buffer-to-builder-asm/webrev/ http://cr.openjdk.java.net/%7Epsandoz/jdk9/sb/JDK-8041679-buffer-to-builder-asm/webrev/
RFR: [8040790] TEST_BUG: tools/javac/innerClassFile/Driver.sh fails to cleanup files after it
Hello! This is a 7u-only test-stabilization one-line fix. The test was reported to not clean after itself, which sometimes causes problems. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8040790 WEBREV: http://cr.openjdk.java.net/~igerasim/8040790/0/webrev/ Sincerely yours, Ivan
Re: RFR: [8040790] TEST_BUG: tools/javac/innerClassFile/Driver.sh fails to cleanup files after it
Thanks! On 29.04.2014 14:49, Lance @ Oracle wrote: This looks ok http://oracle.com/us/design/oracle-email-sig-198324.gifLance Andersen| Principal Member of Technical Staff | +1.781.442.2037 tel:+1.781.442.2037 Oracle Java Engineering 1 Network Drive x-apple-data-detectors://34/0 Burlington, MA 01803 x-apple-data-detectors://34/0 lance.ander...@oracle.com mailto:lance.ander...@oracle.com Sent from my iPad On Apr 29, 2014, at 4:48 AM, Ivan Gerasimov ivan.gerasi...@oracle.com mailto:ivan.gerasi...@oracle.com wrote: Hello! This is a 7u-only test-stabilization one-line fix. The test was reported to not clean after itself, which sometimes causes problems. Would you please help review the fix? BUGURL: https://bugs.openjdk.java.net/browse/JDK-8040790 WEBREV: http://cr.openjdk.java.net/~igerasim/8040790/0/webrev/ http://cr.openjdk.java.net/%7Eigerasim/8040790/0/webrev/ Sincerely yours, Ivan
Re: RFR: 8039173 Propagate errors from Diagnostic Commands as exceptions in the attach framework
Hi Staffan! I think there is a couple of minor bugs in getErrorMessage(InputStream sis, int maxlen). 1) If maxlen is exactly the size of the message to read, the function will add an ellipsis, even though the message isn't truncated, 2) If maxlen is greater than needed, then sis.read(b, off, len) at the line #271 will eventually return -1, and it will cause the message to lose its last character. Sincerely yours, Ivan
Re: RFR: 8039173 Propagate errors from Diagnostic Commands as exceptions in the attach framework
Thank you Staffan for fixing them! But I'm afraid that now the function will never add ellipsis to the message, even if it gets truncated. Sincerely yours, Ivan On 04.04.2014 15:47, Staffan Larsen wrote: Thanks for finding these bugs, Ivan! I have updated the webrev at: http://cr.openjdk.java.net/~sla/8039173/webrev.01/, and I have also included the diff below. The updated webrev also has some changes in the javadoc for VirtualMachine to clarify that some methods can now throw AttachOperationFailedException. Thanks, /Staffan diff --git a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java --- a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java +++ b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java @@ -266,18 +266,21 @@ */ String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen]; -int n, off = 0, len = b.length; +int n, off = 0, len = maxlen; do { n = sis.read(b, off, len); +if (n == -1) { +break; +} off += n; len -= n; -} while (n = 0 off b.length); +} while (off maxlen); String message = null; if (off 0) { message = new String(b, 0, off, UTF-8); } -if (off == b.length message != null) { +if (off b.length message != null) { message += ...; } return message; On 4 apr 2014, at 11:18, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hi Staffan! I think there is a couple of minor bugs in getErrorMessage(InputStream sis, int maxlen). 1) If maxlen is exactly the size of the message to read, the function will add an ellipsis, even though the message isn't truncated, 2) If maxlen is greater than needed, then sis.read(b, off, len) at the line #271 will eventually return -1, and it will cause the message to lose its last character. Sincerely yours, Ivan
Re: RFR: 8039173 Propagate errors from Diagnostic Commands as exceptions in the attach framework
Now you reintroduced the smallish issue, when the message is exactly 200 characters (or whatever maxlen is). The dots will be appended to the message, even though it's not necessary. I think the only reliable way to deal with it is to try to read one extra character from sis. Something like this should do: String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen + 1]; int n, off = 0, len = b.length; do { n = sis.read(b, off, len); if (n == -1) { break; } off += n; len -= n; } while (off maxlen); String message = null; if (off 0) { message = (off maxlen) ? new String(b, 0, maxlen, UTF-8) + ... : new String(b, 0, off, UTF-8); } return message; } Sincerely yours, Ivan On 04.04.2014 16:08, Staffan Larsen wrote: I’m afraid you are right! Doh. Need to add more testing... How about this change: --- a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java +++ b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java @@ -267,9 +267,11 @@ String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen]; int n, off = 0, len = maxlen; +boolean complete = false; do { n = sis.read(b, off, len); if (n == -1) { +complete = true; break; } off += n; @@ -280,7 +282,7 @@ if (off 0) { message = new String(b, 0, off, UTF-8); } -if (off b.length message != null) { +if (!complete message != null) { message += ...; } return message; On 4 apr 2014, at 13:55, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Thank you Staffan for fixing them! But I'm afraid that now the function will never add ellipsis to the message, even if it gets truncated. Sincerely yours, Ivan On 04.04.2014 15:47, Staffan Larsen wrote: Thanks for finding these bugs, Ivan! I have updated the webrev at: http://cr.openjdk.java.net/~sla/8039173/webrev.01/, and I have also included the diff below. The updated webrev also has some changes in the javadoc for VirtualMachine to clarify that some methods can now throw AttachOperationFailedException. Thanks, /Staffan diff --git a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java --- a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java +++ b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java @@ -266,18 +266,21 @@ */ String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen]; -int n, off = 0, len = b.length; +int n, off = 0, len = maxlen; do { n = sis.read(b, off, len); +if (n == -1) { +break; +} off += n; len -= n; -} while (n = 0 off b.length); +} while (off maxlen); String message = null; if (off 0) { message = new String(b, 0, off, UTF-8); } -if (off == b.length message != null) { +if (off b.length message != null) { message += ...; } return message; On 4 apr 2014, at 11:18, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hi Staffan! I think there is a couple of minor bugs in getErrorMessage(InputStream sis, int maxlen). 1) If maxlen is exactly the size of the message to read, the function will add an ellipsis, even though the message isn't truncated, 2) If maxlen is greater than needed, then sis.read(b, off, len) at the line #271 will eventually return -1, and it will cause the message to lose its last character. Sincerely yours, Ivan
Re: RFR: 8039173 Propagate errors from Diagnostic Commands as exceptions in the attach framework
An alternative, more compact variant might be String getErrorMessage(InputStream sis, int maxlen) throws IOException { int n, off = 0, len = maxlen + 1; byte b[] = new byte[len]; while ((n = sis.read(b, off, len - off)) 0) off += n; return (off == 0) ? null : (off len) ? new String(b, 0, off, UTF-8) : new String(b, 0, maxlen, UTF-8) + ...; } Not a big deal, of course. Sincerely yours, Ivan On 04.04.2014 16:24, Ivan Gerasimov wrote: Now you reintroduced the smallish issue, when the message is exactly 200 characters (or whatever maxlen is). The dots will be appended to the message, even though it's not necessary. I think the only reliable way to deal with it is to try to read one extra character from sis. Something like this should do: String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen + 1]; int n, off = 0, len = b.length; do { n = sis.read(b, off, len); if (n == -1) { break; } off += n; len -= n; } while (off maxlen); String message = null; if (off 0) { message = (off maxlen) ? new String(b, 0, maxlen, UTF-8) + ... : new String(b, 0, off, UTF-8); } return message; } Sincerely yours, Ivan On 04.04.2014 16:08, Staffan Larsen wrote: I’m afraid you are right! Doh. Need to add more testing... How about this change: --- a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java +++ b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java @@ -267,9 +267,11 @@ String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen]; int n, off = 0, len = maxlen; +boolean complete = false; do { n = sis.read(b, off, len); if (n == -1) { +complete = true; break; } off += n; @@ -280,7 +282,7 @@ if (off 0) { message = new String(b, 0, off, UTF-8); } -if (off b.length message != null) { +if (!complete message != null) { message += ...; } return message; On 4 apr 2014, at 13:55, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Thank you Staffan for fixing them! But I'm afraid that now the function will never add ellipsis to the message, even if it gets truncated. Sincerely yours, Ivan On 04.04.2014 15:47, Staffan Larsen wrote: Thanks for finding these bugs, Ivan! I have updated the webrev at: http://cr.openjdk.java.net/~sla/8039173/webrev.01/, and I have also included the diff below. The updated webrev also has some changes in the javadoc for VirtualMachine to clarify that some methods can now throw AttachOperationFailedException. Thanks, /Staffan diff --git a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java --- a/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java +++ b/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java @@ -266,18 +266,21 @@ */ String getErrorMessage(InputStream sis, int maxlen) throws IOException { byte b[] = new byte[maxlen]; -int n, off = 0, len = b.length; +int n, off = 0, len = maxlen; do { n = sis.read(b, off, len); +if (n == -1) { +break; +} off += n; len -= n; -} while (n = 0 off b.length); +} while (off maxlen); String message = null; if (off 0) { message = new String(b, 0, off, UTF-8); } -if (off == b.length message != null) { +if (off b.length message != null) { message += ...; } return message; On 4 apr 2014, at 11:18, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hi Staffan! I think there is a couple of minor bugs in getErrorMessage(InputStream sis, int maxlen). 1) If maxlen is exactly the size of the message to read, the function will add an ellipsis, even though the message isn't truncated, 2) If maxlen is greater than needed, then sis.read(b, off, len) at the line #271 will eventually return -1, and it will cause the message to lose its last character. Sincerely yours, Ivan
RFR [8030878] JConsole issues meaningless message if SSL connection fails
Hello! This is a 7u-dev fix only. The issue doesn't exist in 8 nor in 9. Currently when the Jconsole app fails to establish a secured connection it displays a message box with two lines: ConnectionFailedSSL1 ConnectionFailedSSL2 which have a little sense. The fix is to place the correct lines into the massage.properties file. I also renamed the constants to make their names match those in the later releases. Here's the webrev, please have a chance to review. http://cr.openjdk.java.net/~igerasim/8030878/0/webrev/ Sincerely yours, Ivan Gerasimov
RFR [8030698] Several GUI labels in jconsole need corrections
Hello! Would you please review a GUI-only fix for jconsole? It is to fix the following 3 issues: 1) on the 'New Connection' dialog, the label is: 'lt;hostnamegt;...' instead of 'hostname'. 2) on the Overview page, Thread tab, the label is the constant string 'ThreadTab.infoLabelFormat'. 3) on the VM Summary page, Pending finalization displays constant string '{0} objects'. All of them seem to have been introduced with the huge update JDK-7017818. Would you please help review the proposed fix? http://cr.openjdk.java.net/~igerasim/8030698/0/webrev/ The localized messages were taken from these files: http://hg.openjdk.java.net/jdk7u/jdk7u-dev/jdk/file/2dd7fb82f40e/src/share/classes/sun/tools/jconsole/resources/JConsoleResources_ja.java http://hg.openjdk.java.net/jdk7u/jdk7u-dev/jdk/file/2dd7fb82f40e/src/share/classes/sun/tools/jconsole/resources/JConsoleResources_zh_CN.java If approved, the fix will be pushed into 9 and then be backported into 8 and 7. Sincerely yours, Ivan Gerasimov
Re: RFR [8030698] Several GUI labels in jconsole need corrections
Thank you Staffan! On 20.12.2013 16:07, Staffan Larsen wrote: Looks good! I assume this is targeted for jdk9 and/or 8u20? Yes, it's for 9, 8u20 and 7u(80?). Ivan Thanks for fixing this, /Staffan On 20 dec 2013, at 12:05, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hello! Would you please review a GUI-only fix for jconsole? It is to fix the following 3 issues: 1) on the 'New Connection' dialog, the label is: 'lt;hostnamegt;...' instead of 'hostname'. 2) on the Overview page, Thread tab, the label is the constant string 'ThreadTab.infoLabelFormat'. 3) on the VM Summary page, Pending finalization displays constant string '{0} objects'. All of them seem to have been introduced with the huge update JDK-7017818. Would you please help review the proposed fix? http://cr.openjdk.java.net/~igerasim/8030698/0/webrev/ The localized messages were taken from these files: http://hg.openjdk.java.net/jdk7u/jdk7u-dev/jdk/file/2dd7fb82f40e/src/share/classes/sun/tools/jconsole/resources/JConsoleResources_ja.java http://hg.openjdk.java.net/jdk7u/jdk7u-dev/jdk/file/2dd7fb82f40e/src/share/classes/sun/tools/jconsole/resources/JConsoleResources_zh_CN.java If approved, the fix will be pushed into 9 and then be backported into 8 and 7. Sincerely yours, Ivan Gerasimov
Re: RFR [8025886] replace == and [[ bash extensions
Hello all! Any chance to have this simple fix approved? The proposal is to get rid of == and [[ bash extensions, as this is the only place they are used in jdk regtests and sh-shell is not happy with them. Thanks in advance, Ivan Gerasimov --- a/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh +++ b/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh @@ -34,12 +34,13 @@ OS=`uname -s` UMASK=`umask` -if [[ $OS == CYGWIN_NT* ]] ; then +case $OS in +CYGWIN_NT*) OS=Windows_NT if [ -z $SystemRoot ] ; then -SystemRoot=$SYSTEMROOT +SystemRoot=$SYSTEMROOT fi -fi +esac case $OS in SunOS | Linux | Darwin) On 09.10.2013 0:34, Ivan Gerasimov wrote: Thanks, Dmitry! I assume I still need an approval from the Reviewer. Sincerely yours, Ivan On 05.10.2013 21:30, Dmitry Samersoff wrote: Ivan, Looks good for me. -Dmitry On 2013-10-05 17:04, Ivan Gerasimov wrote: Dmitry, thanks for suggestion! Yes, == comparison isn't the only sh-incompatible thing in the script. Sh may be unhappy with [[ as well. So I replaced it with case as you suggested. Grep shows that it was the only place where [[ and == were used in regtests, so it would be good to make things consistent. Please find a new patch below. Sincerely yours, Ivan --- a/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh +++ b/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh @@ -34,12 +34,13 @@ OS=`uname -s` UMASK=`umask` -if [[ $OS == CYGWIN_NT* ]] ; then +case $OS in +CYGWIN_NT*) OS=Windows_NT if [ -z $SystemRoot ] ; then -SystemRoot=$SYSTEMROOT +SystemRoot=$SYSTEMROOT fi -fi +esac case $OS in SunOS | Linux | Darwin) On 04.10.2013 15:34, Dmitry Samersoff wrote: Ivan, If you need shell pattern match CYGWIN_NT* it's better to use case but not if -Dmitry
Re: RFR [8025886] typo in shell regtest == instead of =
Thanks, Dmitry! I assume I still need an approval from the Reviewer. Sincerely yours, Ivan On 05.10.2013 21:30, Dmitry Samersoff wrote: Ivan, Looks good for me. -Dmitry On 2013-10-05 17:04, Ivan Gerasimov wrote: Dmitry, thanks for suggestion! Yes, == comparison isn't the only sh-incompatible thing in the script. Sh may be unhappy with [[ as well. So I replaced it with case as you suggested. Grep shows that it was the only place where [[ and == were used in regtests, so it would be good to make things consistent. Please find a new patch below. Sincerely yours, Ivan --- a/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh +++ b/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh @@ -34,12 +34,13 @@ OS=`uname -s` UMASK=`umask` -if [[ $OS == CYGWIN_NT* ]] ; then +case $OS in +CYGWIN_NT*) OS=Windows_NT if [ -z $SystemRoot ] ; then -SystemRoot=$SYSTEMROOT +SystemRoot=$SYSTEMROOT fi -fi +esac case $OS in SunOS | Linux | Darwin) On 04.10.2013 15:34, Dmitry Samersoff wrote: Ivan, If you need shell pattern match CYGWIN_NT* it's better to use case but not if -Dmitry
Re: RFR [8025886] typo in shell regtest == instead of =
Dmitry, thanks for suggestion! Yes, == comparison isn't the only sh-incompatible thing in the script. Sh may be unhappy with [[ as well. So I replaced it with case as you suggested. Grep shows that it was the only place where [[ and == were used in regtests, so it would be good to make things consistent. Please find a new patch below. Sincerely yours, Ivan --- a/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh +++ b/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh @@ -34,12 +34,13 @@ OS=`uname -s` UMASK=`umask` -if [[ $OS == CYGWIN_NT* ]] ; then +case $OS in +CYGWIN_NT*) OS=Windows_NT if [ -z $SystemRoot ] ; then -SystemRoot=$SYSTEMROOT +SystemRoot=$SYSTEMROOT fi -fi +esac case $OS in SunOS | Linux | Darwin) On 04.10.2013 15:34, Dmitry Samersoff wrote: Ivan, If you need shell pattern match CYGWIN_NT* it's better to use case but not if -Dmitry
RFR [8025886] typo in shell regtest == instead of =
Hello! May I please have a review for a simple fix of the shell regtest? Bash seems to accept both = and == comparisons, but sh accepts only =. http://bugs.sun.com/view_bug.do?bug_id=8025886 (not yet visible.) Sincerely yours, Ivan --- a/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh +++ b/test/sun/management/jmxremote/bootstrap/GeneratePropertyPassword.sh @@ -34,7 +34,7 @@ OS=`uname -s` UMASK=`umask` -if [[ $OS == CYGWIN_NT* ]] ; then +if [[ $OS = CYGWIN_NT* ]] ; then OS=Windows_NT if [ -z $SystemRoot ] ; then SystemRoot=$SYSTEMROOT
Re: RFR [8014792] Need a test to check if path to java debugger has spaces
Hi Staffan! On 25.09.2013 2:30, Staffan Larsen wrote: Ivan, I'm not sure this test adds anything to the current testing. As far as I understand, the test checks if jdb is in a location that contains space and then tries to start it. If the path does not contain space it does nothing. We already have many tests that launch jdb, regardless of space in the path, and it thus looks to me like this new test does not add anything to the testing. Yes, you're right. I've just checked jdk8-b90 (last build prior the fix for 8014676), and a bunch of tests failed with the appropriate messages. Here're are some to name a few: com/sun/jdi/connect/spi/JdiLoadedByCustomLoader.sh com/sun/jdi/connect/spi/DebugUsingCustomConnector.java com/sun/jdi/BadHandshakeTest.java com/sun/jdi/DoubleAgentTest.java com/sun/jdi/ExclusiveBind.java I only wander why the regression hadn't been caught immediately. Thus I'm withdrawing the request and closing the bug as Won't fix. Sincerely yours, Ivan If you want to really verify the bug you need to move jdb into a location that actually has a space in it and then launch it. I agree that it would be good to have, though I think it's better to set it up in the testing environment. Please let me know if I misunderstood your test. Thanks, /Staffan On 24 sep 2013, at 08:12, Ivan Gerasimov ivan.gerasi...@oracle.com wrote: Hello all! Would you please help review a new regression test? This is related to the fix for 8014676 (jdb could not run under Windows, if its path contained a space). The test simply checks, if the path to the tested jdk contains a space and exits, if it does not. Otherwise it tries to run the debugger. With the bug 8014676, the debugger will report IOExeption thrown. The complicated part is that with the presence of no error, the debugger will not exit, but wait for the user command instead. To prevent it, I pass a nonexistent option to the debuger. Thus it must fail with an error message. Here's the webrev: http://cr.openjdk.java.net/~igerasim/8014792/0/webrev/ Here's the bug: http://bugs.sun.com/view_bug.do?bug_id=8014792 Here's the fix for 8014676: http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/81c449fd18fe Sincerely yours, Ivan Gerasimov
RFR [8014792] Need a test to check if path to java debugger has spaces
Hello all! Would you please help review a new regression test? This is related to the fix for 8014676 (jdb could not run under Windows, if its path contained a space). The test simply checks, if the path to the tested jdk contains a space and exits, if it does not. Otherwise it tries to run the debugger. With the bug 8014676, the debugger will report IOExeption thrown. The complicated part is that with the presence of no error, the debugger will not exit, but wait for the user command instead. To prevent it, I pass a nonexistent option to the debuger. Thus it must fail with an error message. Here's the webrev: http://cr.openjdk.java.net/~igerasim/8014792/0/webrev/ Here's the bug: http://bugs.sun.com/view_bug.do?bug_id=8014792 Here's the fix for 8014676: http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/81c449fd18fe Sincerely yours, Ivan Gerasimov
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Hello Daniel! Can we please move forward with this issue? I've just checked how the tests run against the latest jdk build (which is 99), and the leak is still there. Thus the proposed change (including ProblemList modifications) is still actual. Here's a link to the latest webrev: http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ I've also run jprt job to check how it works on all other platforms: http://prt-web.us.oracle.com//archive/2013/07/2013-07-22-143846.igerasim.jdk/logs/ On all the platforms (including 32-bit Linux) the tests passed as expected, on the Linux-x64 both tests were skipped. Virtual memory growth across {fastdebug,product}-{c1,c2} targets was in range from 1188K to 2584K which is less than the chosen threshold of 32M. Sincerely yours, Ivan Gerasimov On 10.07.2013 14:38, Ivan Gerasimov wrote: Yes, I forgot to include the most important thing - a link to webrev! Your link is correct. http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ The tests fail on linux-x64 only, linux-i586 is fine. Here's the link to the logs of the tests run by Daniel Daugherty: http://prt-web.us.oracle.com//archive/2013/07/2013-07-05-045326.ddaugher.8016838_exp/logs/ Thus the memory leak seems to be specific to 64-bit Linux. Sincerely yours, Ivan On 10.07.2013 13:15, Seán Coffey wrote: Ivan, I'll assume this is the latest webrev : http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ http://cr.openjdk.java.net/%7Eigerasim/8016838/3/webrev/ One comment - should the test be excluded for all linux variants (i.e. linux-all) ? regards, Sean. On 09/07/2013 14:09, Ivan Gerasimov wrote: Please have a chance to review an updated webrev. It now includes a change to ProblemList.txt, so both modified tests are ignored for linux-x64. Sincerely yours, Ivan Gersimov On 08.07.2013 21:27, Seán Coffey wrote: On 08/07/13 17:55, Ivan Gerasimov wrote: Thanks, Seán! I located the build in which the memleak was first introduced -- it is jdk8-b69 (hs25-b13). I've updated the bug http://bugs.sun.com/view_bug.do?bug_id=8019845 with this. So what is the correct procedure to go forward now? Should I update the webrev to include changes to the problem list? I believe I shouldn't -- this list seems to be a sensitive stuff. I'd suggest updating the webrev with the ProblemList modification/addition. It's best not to add a test to testruns if it's knowingly failing. The test can be removed from ProblemList when the jdk8 regression is fixed. regards, Sean. Sincerely yours, Ivan On 05.07.2013 15:45, Seán Coffey wrote: Nice work indeed Ivan. Good to have a reliable testcase to catch leaks in this area. I'd also suggest that this test goes on the ProblemList until the new leak is sorted out for jdk8. The goal of JPRT runs is to have clean runs. If it's on the problemList, then it's a known issue and is normally tagged with the relevant bug ID so that the responsible engineer knows the current state. regards, Sean. On 05/07/2013 11:53, Ivan Gerasimov wrote: On 05.07.2013 8:35, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? I've only run test from java/lang/instrument when checked the change with JDK6u60 (all passed) and with JDK6u51 (the test failed as expected, since the leak had still been there.) Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. Right. The test shown a memleak with the the latest jdk8. I filed a bug 8019845 about this leak. Stefan Karlsson guessed that this may be related to 8003419 (NPG: Clean up metadata created during class loading if failure) Then I've checked the builds b57 (test failed) and b58 (test passed), so I confirmed that it may be the reason of the leak being observed now. But now I think that the reason may be different. It just turns out that the test shows failures for (at least) three different leaks - the one you, Daniel, solved (7121600), the one Stefan wrote about (8003419) and the one currently observed (8019845). That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However, the TL folks might not like that either... Well, the leak is there, and why not have a failing test as a reminder to have it fixed? Sincerely yours, Ivan Gerasimov Dan On 7/4/13 6:59 PM, Ivan Gerasimov wrote: Thank you, Daniel! Please find an updated webrev at http://cr.openjdk.java.net/~igerasim/8016838/2/webrev/. It now
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
On 23.07.2013 18:50, Daniel D. Daugherty wrote: Yes. Do you have a pointer to the committed patch file? If so, I'll take care of getting the fix into JDK8-TL. Thanks a lot! I really appreciate it. Here's the link: http://cr.openjdk.java.net/~igerasim/2commit/8016838-jdk8-ReBigClass-improved.patch I've just updated the file to include the latest changes. Sincerely yours, Ivan Dan I've just checked how the tests run against the latest jdk build (which is 99), and the leak is still there. Thus the proposed change (including ProblemList modifications) is still actual. Here's a link to the latest webrev: http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ I've also run jprt job to check how it works on all other platforms: http://prt-web.us.oracle.com//archive/2013/07/2013-07-22-143846.igerasim.jdk/logs/ On all the platforms (including 32-bit Linux) the tests passed as expected, on the Linux-x64 both tests were skipped. Virtual memory growth across {fastdebug,product}-{c1,c2} targets was in range from 1188K to 2584K which is less than the chosen threshold of 32M. Sincerely yours, Ivan Gerasimov On 10.07.2013 14:38, Ivan Gerasimov wrote: Yes, I forgot to include the most important thing - a link to webrev! Your link is correct. http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ The tests fail on linux-x64 only, linux-i586 is fine. Here's the link to the logs of the tests run by Daniel Daugherty: http://prt-web.us.oracle.com//archive/2013/07/2013-07-05-045326.ddaugher.8016838_exp/logs/ Thus the memory leak seems to be specific to 64-bit Linux. Sincerely yours, Ivan On 10.07.2013 13:15, Seán Coffey wrote: Ivan, I'll assume this is the latest webrev : http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ http://cr.openjdk.java.net/%7Eigerasim/8016838/3/webrev/ One comment - should the test be excluded for all linux variants (i.e. linux-all) ? regards, Sean. On 09/07/2013 14:09, Ivan Gerasimov wrote: Please have a chance to review an updated webrev. It now includes a change to ProblemList.txt, so both modified tests are ignored for linux-x64. Sincerely yours, Ivan Gersimov On 08.07.2013 21:27, Seán Coffey wrote: On 08/07/13 17:55, Ivan Gerasimov wrote: Thanks, Seán! I located the build in which the memleak was first introduced -- it is jdk8-b69 (hs25-b13). I've updated the bug http://bugs.sun.com/view_bug.do?bug_id=8019845 with this. So what is the correct procedure to go forward now? Should I update the webrev to include changes to the problem list? I believe I shouldn't -- this list seems to be a sensitive stuff. I'd suggest updating the webrev with the ProblemList modification/addition. It's best not to add a test to testruns if it's knowingly failing. The test can be removed from ProblemList when the jdk8 regression is fixed. regards, Sean. Sincerely yours, Ivan On 05.07.2013 15:45, Seán Coffey wrote: Nice work indeed Ivan. Good to have a reliable testcase to catch leaks in this area. I'd also suggest that this test goes on the ProblemList until the new leak is sorted out for jdk8. The goal of JPRT runs is to have clean runs. If it's on the problemList, then it's a known issue and is normally tagged with the relevant bug ID so that the responsible engineer knows the current state. regards, Sean. On 05/07/2013 11:53, Ivan Gerasimov wrote: On 05.07.2013 8:35, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? I've only run test from java/lang/instrument when checked the change with JDK6u60 (all passed) and with JDK6u51 (the test failed as expected, since the leak had still been there.) Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. Right. The test shown a memleak with the the latest jdk8. I filed a bug 8019845 about this leak. Stefan Karlsson guessed that this may be related to 8003419 (NPG: Clean up metadata created during class loading if failure) Then I've checked the builds b57 (test failed) and b58 (test passed), so I confirmed that it may be the reason of the leak being observed now. But now I think that the reason may be different. It just turns out that the test shows failures for (at least) three different leaks - the one you, Daniel, solved (7121600), the one Stefan wrote about (8003419) and the one currently observed (8019845). That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Yes, I forgot to include the most important thing - a link to webrev! Your link is correct. http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ The tests fail on linux-x64 only, linux-i586 is fine. Here's the link to the logs of the tests run by Daniel Daugherty: http://prt-web.us.oracle.com//archive/2013/07/2013-07-05-045326.ddaugher.8016838_exp/logs/ Thus the memory leak seems to be specific to 64-bit Linux. Sincerely yours, Ivan On 10.07.2013 13:15, Seán Coffey wrote: Ivan, I'll assume this is the latest webrev : http://cr.openjdk.java.net/~igerasim/8016838/3/webrev/ http://cr.openjdk.java.net/%7Eigerasim/8016838/3/webrev/ One comment - should the test be excluded for all linux variants (i.e. linux-all) ? regards, Sean. On 09/07/2013 14:09, Ivan Gerasimov wrote: Please have a chance to review an updated webrev. It now includes a change to ProblemList.txt, so both modified tests are ignored for linux-x64. Sincerely yours, Ivan Gersimov On 08.07.2013 21:27, Seán Coffey wrote: On 08/07/13 17:55, Ivan Gerasimov wrote: Thanks, Seán! I located the build in which the memleak was first introduced -- it is jdk8-b69 (hs25-b13). I've updated the bug http://bugs.sun.com/view_bug.do?bug_id=8019845 with this. So what is the correct procedure to go forward now? Should I update the webrev to include changes to the problem list? I believe I shouldn't -- this list seems to be a sensitive stuff. I'd suggest updating the webrev with the ProblemList modification/addition. It's best not to add a test to testruns if it's knowingly failing. The test can be removed from ProblemList when the jdk8 regression is fixed. regards, Sean. Sincerely yours, Ivan On 05.07.2013 15:45, Seán Coffey wrote: Nice work indeed Ivan. Good to have a reliable testcase to catch leaks in this area. I'd also suggest that this test goes on the ProblemList until the new leak is sorted out for jdk8. The goal of JPRT runs is to have clean runs. If it's on the problemList, then it's a known issue and is normally tagged with the relevant bug ID so that the responsible engineer knows the current state. regards, Sean. On 05/07/2013 11:53, Ivan Gerasimov wrote: On 05.07.2013 8:35, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? I've only run test from java/lang/instrument when checked the change with JDK6u60 (all passed) and with JDK6u51 (the test failed as expected, since the leak had still been there.) Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. Right. The test shown a memleak with the the latest jdk8. I filed a bug 8019845 about this leak. Stefan Karlsson guessed that this may be related to 8003419 (NPG: Clean up metadata created during class loading if failure) Then I've checked the builds b57 (test failed) and b58 (test passed), so I confirmed that it may be the reason of the leak being observed now. But now I think that the reason may be different. It just turns out that the test shows failures for (at least) three different leaks - the one you, Daniel, solved (7121600), the one Stefan wrote about (8003419) and the one currently observed (8019845). That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However, the TL folks might not like that either... Well, the leak is there, and why not have a failing test as a reminder to have it fixed? Sincerely yours, Ivan Gerasimov Dan On 7/4/13 6:59 PM, Ivan Gerasimov wrote: Thank you, Daniel! Please find an updated webrev at http://cr.openjdk.java.net/~igerasim/8016838/2/webrev/. It now includes the RetransformBigClass test modified in the same way as RedefineBigClass If the changes look fine, may I ask you to sponsor the commit, as I'm not a committer? Here's a link to hg export: http://cr.openjdk.java.net/~igerasim/2commit/8016838-jdk8-ReBigClass-improved.patch Thanks in advance, Ivan On 04.07.2013 21:45, Daniel D. Daugherty wrote: On 7/4/13 11:19 AM, Ivan Gerasimov wrote: Daniel, thank you for review! Here's the updated with all all your suggestions adopted. http://cr.openjdk.java.net/~igerasim/8016838/1/webrev/ Looks good. I haven't yet considered applying the approach to RetransformBigClass. Do you want me to include this into this same change set or should I make it separately? I would include it in the same changeset. Dan Sincerely yours, Ivan On 04.07.2013 19:34, Daniel D. Daugherty wrote
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Please have a chance to review an updated webrev. It now includes a change to ProblemList.txt, so both modified tests are ignored for linux-x64. Sincerely yours, Ivan Gersimov On 08.07.2013 21:27, Seán Coffey wrote: On 08/07/13 17:55, Ivan Gerasimov wrote: Thanks, Seán! I located the build in which the memleak was first introduced -- it is jdk8-b69 (hs25-b13). I've updated the bug http://bugs.sun.com/view_bug.do?bug_id=8019845 with this. So what is the correct procedure to go forward now? Should I update the webrev to include changes to the problem list? I believe I shouldn't -- this list seems to be a sensitive stuff. I'd suggest updating the webrev with the ProblemList modification/addition. It's best not to add a test to testruns if it's knowingly failing. The test can be removed from ProblemList when the jdk8 regression is fixed. regards, Sean. Sincerely yours, Ivan On 05.07.2013 15:45, Seán Coffey wrote: Nice work indeed Ivan. Good to have a reliable testcase to catch leaks in this area. I'd also suggest that this test goes on the ProblemList until the new leak is sorted out for jdk8. The goal of JPRT runs is to have clean runs. If it's on the problemList, then it's a known issue and is normally tagged with the relevant bug ID so that the responsible engineer knows the current state. regards, Sean. On 05/07/2013 11:53, Ivan Gerasimov wrote: On 05.07.2013 8:35, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? I've only run test from java/lang/instrument when checked the change with JDK6u60 (all passed) and with JDK6u51 (the test failed as expected, since the leak had still been there.) Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. Right. The test shown a memleak with the the latest jdk8. I filed a bug 8019845 about this leak. Stefan Karlsson guessed that this may be related to 8003419 (NPG: Clean up metadata created during class loading if failure) Then I've checked the builds b57 (test failed) and b58 (test passed), so I confirmed that it may be the reason of the leak being observed now. But now I think that the reason may be different. It just turns out that the test shows failures for (at least) three different leaks - the one you, Daniel, solved (7121600), the one Stefan wrote about (8003419) and the one currently observed (8019845). That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However, the TL folks might not like that either... Well, the leak is there, and why not have a failing test as a reminder to have it fixed? Sincerely yours, Ivan Gerasimov Dan On 7/4/13 6:59 PM, Ivan Gerasimov wrote: Thank you, Daniel! Please find an updated webrev at http://cr.openjdk.java.net/~igerasim/8016838/2/webrev/. It now includes the RetransformBigClass test modified in the same way as RedefineBigClass If the changes look fine, may I ask you to sponsor the commit, as I'm not a committer? Here's a link to hg export: http://cr.openjdk.java.net/~igerasim/2commit/8016838-jdk8-ReBigClass-improved.patch Thanks in advance, Ivan On 04.07.2013 21:45, Daniel D. Daugherty wrote: On 7/4/13 11:19 AM, Ivan Gerasimov wrote: Daniel, thank you for review! Here's the updated with all all your suggestions adopted. http://cr.openjdk.java.net/~igerasim/8016838/1/webrev/ Looks good. I haven't yet considered applying the approach to RetransformBigClass. Do you want me to include this into this same change set or should I make it separately? I would include it in the same changeset. Dan Sincerely yours, Ivan On 04.07.2013 19:34, Daniel D. Daugherty wrote: On 7/3/13 11:12 AM, Ivan Gerasimov wrote: Hello everybody! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ Very nicely done! The logic in RedefineBigClass.sh only catches a leak big enough to cause an Exception to be thrown. When I originally wrote this test and its companion: test/java/lang/instrument
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Thanks, Seán! I located the build in which the memleak was first introduced -- it is jdk8-b69 (hs25-b13). I've updated the bug http://bugs.sun.com/view_bug.do?bug_id=8019845 with this. So what is the correct procedure to go forward now? Should I update the webrev to include changes to the problem list? I believe I shouldn't -- this list seems to be a sensitive stuff. Sincerely yours, Ivan On 05.07.2013 15:45, Seán Coffey wrote: Nice work indeed Ivan. Good to have a reliable testcase to catch leaks in this area. I'd also suggest that this test goes on the ProblemList until the new leak is sorted out for jdk8. The goal of JPRT runs is to have clean runs. If it's on the problemList, then it's a known issue and is normally tagged with the relevant bug ID so that the responsible engineer knows the current state. regards, Sean. On 05/07/2013 11:53, Ivan Gerasimov wrote: On 05.07.2013 8:35, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? I've only run test from java/lang/instrument when checked the change with JDK6u60 (all passed) and with JDK6u51 (the test failed as expected, since the leak had still been there.) Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. Right. The test shown a memleak with the the latest jdk8. I filed a bug 8019845 about this leak. Stefan Karlsson guessed that this may be related to 8003419 (NPG: Clean up metadata created during class loading if failure) Then I've checked the builds b57 (test failed) and b58 (test passed), so I confirmed that it may be the reason of the leak being observed now. But now I think that the reason may be different. It just turns out that the test shows failures for (at least) three different leaks - the one you, Daniel, solved (7121600), the one Stefan wrote about (8003419) and the one currently observed (8019845). That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However, the TL folks might not like that either... Well, the leak is there, and why not have a failing test as a reminder to have it fixed? Sincerely yours, Ivan Gerasimov Dan On 7/4/13 6:59 PM, Ivan Gerasimov wrote: Thank you, Daniel! Please find an updated webrev at http://cr.openjdk.java.net/~igerasim/8016838/2/webrev/. It now includes the RetransformBigClass test modified in the same way as RedefineBigClass If the changes look fine, may I ask you to sponsor the commit, as I'm not a committer? Here's a link to hg export: http://cr.openjdk.java.net/~igerasim/2commit/8016838-jdk8-ReBigClass-improved.patch Thanks in advance, Ivan On 04.07.2013 21:45, Daniel D. Daugherty wrote: On 7/4/13 11:19 AM, Ivan Gerasimov wrote: Daniel, thank you for review! Here's the updated with all all your suggestions adopted. http://cr.openjdk.java.net/~igerasim/8016838/1/webrev/ Looks good. I haven't yet considered applying the approach to RetransformBigClass. Do you want me to include this into this same change set or should I make it separately? I would include it in the same changeset. Dan Sincerely yours, Ivan On 04.07.2013 19:34, Daniel D. Daugherty wrote: On 7/3/13 11:12 AM, Ivan Gerasimov wrote: Hello everybody! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ Very nicely done! The logic in RedefineBigClass.sh only catches a leak big enough to cause an Exception to be thrown. When I originally wrote this test and its companion: test/java/lang/instrument/RetransformBigClass* I could not come up with a platform independent way to put a small upper limit on memory growth. It never dawned on me that putting in something on one platform (Linux) was better than nothing. Are you planning to add this same logic to RetransformBigClass*? test/java/lang/instrument/RedefineBigClass.sh No comments. test/java/lang/instrument/RedefineBigClassApp.java line 48: final long MEM_LEAK_THRESHOLD = 32 * 1024; // 32Mb Would be better at the top of RedefineBigClassApp
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
I'm looking at the log of the job you've run: http://prt-web.us.oracle.com//archive/2013/07/2013-07-05-045326.ddaugher.8016838_exp/logs/linux_x64-product-c2-jdk_lang.log.FAILED.log And it looks like both tests failed, not only the first one: FAIL: Virtual memory usage increased by 1411072Kb (greater than 32768Kb) FAIL: RedefineBigClassApp exited with status of 1 ... FAIL: Virtual memory usage increased by 1411072Kb (greater than 32768Kb) FAIL: RetransformBigClassApp exited with status of 1 I have these two tests locally on 64-bit Linux and they both fail with similar results in logs. I may not tell for sure why the tests didn't fail on 32-bit Linux. Whether it was because the /proc/self/stat approach didn't work or because the leak is specific to 64-bit platform. I have tested the RedefineBigClass test on JPRT with JDK6 (the code was a bit different, but the approach was the same). The test passed with JDK6u60 and failed (as expected) with JDK6u51 on both 32 and 64-bit Linux machines. I'm going to do more testings today. Sincerely yours, Ivan On 05.07.2013 9:37, Daniel D. Daugherty wrote: In a JPRT test job I just ran: RedefineBigClass.sh passed on 9 platforms and failed on Linux 64-bit. RetransformBigClass.sh passed on all 10 platforms. Does your own testing only showing failure on the Linux 64-bit VM and passed on the Linux 32-bit VM? Dan On 7/4/13 10:35 PM, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However, the TL folks might not like that either... Dan On 7/4/13 6:59 PM, Ivan Gerasimov wrote: Thank you, Daniel! Please find an updated webrev at http://cr.openjdk.java.net/~igerasim/8016838/2/webrev/. It now includes the RetransformBigClass test modified in the same way as RedefineBigClass If the changes look fine, may I ask you to sponsor the commit, as I'm not a committer? Here's a link to hg export: http://cr.openjdk.java.net/~igerasim/2commit/8016838-jdk8-ReBigClass-improved.patch Thanks in advance, Ivan On 04.07.2013 21:45, Daniel D. Daugherty wrote: On 7/4/13 11:19 AM, Ivan Gerasimov wrote: Daniel, thank you for review! Here's the updated with all all your suggestions adopted. http://cr.openjdk.java.net/~igerasim/8016838/1/webrev/ Looks good. I haven't yet considered applying the approach to RetransformBigClass. Do you want me to include this into this same change set or should I make it separately? I would include it in the same changeset. Dan Sincerely yours, Ivan On 04.07.2013 19:34, Daniel D. Daugherty wrote: On 7/3/13 11:12 AM, Ivan Gerasimov wrote: Hello everybody! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ Very nicely done! The logic in RedefineBigClass.sh only catches a leak big enough to cause an Exception to be thrown. When I originally wrote this test and its companion: test/java/lang/instrument/RetransformBigClass* I could not come up with a platform independent way to put a small upper limit on memory growth. It never dawned on me that putting in something on one platform (Linux) was better than nothing. Are you planning to add this same logic to RetransformBigClass*? test/java/lang/instrument/RedefineBigClass.sh No comments. test/java/lang/instrument/RedefineBigClassApp.java line 48: final long MEM_LEAK_THRESHOLD = 32 * 1024; // 32Mb Would be better at the top of RedefineBigClassApp rather than buried down here. line 51: Long.valueOf(vmMemDelta) There are several places where a long is passed to Long.valueOf(). Is there some reason you're not passing them directly to println()? line 54: } else { The else is redundant due to the System.exit(1) call above it. You can drop the else { and } and shift that last println() to the left. line 71: return Long.parseLong
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
On 05.07.2013 8:35, Daniel D. Daugherty wrote: Ivan, The changes look fine, I can sponsor your commit, looks like your OpenJDK user name is 'igerasim', but I need to know a little bit more about your testing of these fixes. Did you do a test JPRT job to run the JLI tests (or just the two tests themselves)? I've only run test from java/lang/instrument when checked the change with JDK6u60 (all passed) and with JDK6u51 (the test failed as expected, since the leak had still been there.) Based on e-mail about this bug fix, I believe you've found a new leak in JDK8/HSX-25 with test java/lang/instrument/RedefineBigClass.sh. Right. The test shown a memleak with the the latest jdk8. I filed a bug 8019845 about this leak. Stefan Karlsson guessed that this may be related to 8003419 (NPG: Clean up metadata created during class loading if failure) Then I've checked the builds b57 (test failed) and b58 (test passed), so I confirmed that it may be the reason of the leak being observed now. But now I think that the reason may be different. It just turns out that the test shows failures for (at least) three different leaks - the one you, Daniel, solved (7121600), the one Stefan wrote about (8003419) and the one currently observed (8019845). That's a good thing, but I think Alan and company would be a bit grumpy with me if I pushed a test that failed as soon as someone ran it. :-) Do you know if the revised RetransformBigClass.sh also finds a new memory leak in JDK8/HSX-25? The way to deal with that is to put the test on the Problem.list as part of the same changeset. However, the TL folks might not like that either... Well, the leak is there, and why not have a failing test as a reminder to have it fixed? Sincerely yours, Ivan Gerasimov Dan On 7/4/13 6:59 PM, Ivan Gerasimov wrote: Thank you, Daniel! Please find an updated webrev at http://cr.openjdk.java.net/~igerasim/8016838/2/webrev/. It now includes the RetransformBigClass test modified in the same way as RedefineBigClass If the changes look fine, may I ask you to sponsor the commit, as I'm not a committer? Here's a link to hg export: http://cr.openjdk.java.net/~igerasim/2commit/8016838-jdk8-ReBigClass-improved.patch Thanks in advance, Ivan On 04.07.2013 21:45, Daniel D. Daugherty wrote: On 7/4/13 11:19 AM, Ivan Gerasimov wrote: Daniel, thank you for review! Here's the updated with all all your suggestions adopted. http://cr.openjdk.java.net/~igerasim/8016838/1/webrev/ Looks good. I haven't yet considered applying the approach to RetransformBigClass. Do you want me to include this into this same change set or should I make it separately? I would include it in the same changeset. Dan Sincerely yours, Ivan On 04.07.2013 19:34, Daniel D. Daugherty wrote: On 7/3/13 11:12 AM, Ivan Gerasimov wrote: Hello everybody! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ Very nicely done! The logic in RedefineBigClass.sh only catches a leak big enough to cause an Exception to be thrown. When I originally wrote this test and its companion: test/java/lang/instrument/RetransformBigClass* I could not come up with a platform independent way to put a small upper limit on memory growth. It never dawned on me that putting in something on one platform (Linux) was better than nothing. Are you planning to add this same logic to RetransformBigClass*? test/java/lang/instrument/RedefineBigClass.sh No comments. test/java/lang/instrument/RedefineBigClassApp.java line 48: final long MEM_LEAK_THRESHOLD = 32 * 1024; // 32Mb Would be better at the top of RedefineBigClassApp rather than buried down here. line 51: Long.valueOf(vmMemDelta) There are several places where a long is passed to Long.valueOf(). Is there some reason you're not passing them directly to println()? line 54: } else { The else is redundant due to the System.exit(1) call above it. You can drop the else { and } and shift that last println() to the left. line 71: return Long.parseLong(line.split( )[22]) / 1024; How about at least a comment referring to the Linux proc(5) man page... and the fact that the 23rd field is: vsize %lu Virtual memory size in bytes. Again, very nicely done! Dan The surprising thing is that modified test detected the leak with the latest JDK8! Latest 6 and 7 are fine, so it might be regression in JDK8. I've filled a bug http://bugs.sun.com/view_bug.do?bug_id
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Hello again! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ The surprising thing is that modified test detected the leak with the latest JDK8! Latest 6 and 7 are fine, so it might be regression in JDK8. I've filled a bug http://bugs.sun.com/view_bug.do?bug_id=8019845 Memory leak during class redefenition (not yet available.) As pointed out by Stefan Karlsson, the detected leak may not be related to the class redefinition and be related to [1] instead. However the webrev is still relevant, so welcome to review! [1] http://bugs.sun.com/view_bug.do?bug_id=8003419 Sincerely yours, Ivan Gerasimov
Re: RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Daniel, thank you for review! Here's the updated with all all your suggestions adopted. http://cr.openjdk.java.net/~igerasim/8016838/1/webrev/ I haven't yet considered applying the approach to RetransformBigClass. Do you want me to include this into this same change set or should I make it separately? Sincerely yours, Ivan On 04.07.2013 19:34, Daniel D. Daugherty wrote: On 7/3/13 11:12 AM, Ivan Gerasimov wrote: Hello everybody! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ Very nicely done! The logic in RedefineBigClass.sh only catches a leak big enough to cause an Exception to be thrown. When I originally wrote this test and its companion: test/java/lang/instrument/RetransformBigClass* I could not come up with a platform independent way to put a small upper limit on memory growth. It never dawned on me that putting in something on one platform (Linux) was better than nothing. Are you planning to add this same logic to RetransformBigClass*? test/java/lang/instrument/RedefineBigClass.sh No comments. test/java/lang/instrument/RedefineBigClassApp.java line 48: final long MEM_LEAK_THRESHOLD = 32 * 1024; // 32Mb Would be better at the top of RedefineBigClassApp rather than buried down here. line 51: Long.valueOf(vmMemDelta) There are several places where a long is passed to Long.valueOf(). Is there some reason you're not passing them directly to println()? line 54: } else { The else is redundant due to the System.exit(1) call above it. You can drop the else { and } and shift that last println() to the left. line 71: return Long.parseLong(line.split( )[22]) / 1024; How about at least a comment referring to the Linux proc(5) man page... and the fact that the 23rd field is: vsize %lu Virtual memory size in bytes. Again, very nicely done! Dan The surprising thing is that modified test detected the leak with the latest JDK8! Latest 6 and 7 are fine, so it might be regression in JDK8. I've filled a bug http://bugs.sun.com/view_bug.do?bug_id=8019845 Memory leak during class redefenition (not yet available.) Sincerely yours, Ivan Gerasimov
RFR [8016838] java/lang/instrument/RedefineBigClass.sh needs modification
Hello everybody! We have a request to improve jtreg test. The test had been written to verify fix for memory leak during class redefinition. The problem is that it always is reported as PASSED even in the presence of the leak. The proposed change is platform specific. It allows memory leak detection only on Linux. This is because the memory leak was in native code, so there's no easy way to detect it from within Java code. Here's the webrev for JDK8 repository: http://cr.openjdk.java.net/~igerasim/8016838/0/webrev/ The surprising thing is that modified test detected the leak with the latest JDK8! Latest 6 and 7 are fine, so it might be regression in JDK8. I've filled a bug http://bugs.sun.com/view_bug.do?bug_id=8019845 Memory leak during class redefenition (not yet available.) Sincerely yours, Ivan Gerasimov
RFR [8014676] Java debugger may fail to run
Hello everybody! Would you please help with reviewing the fix? The fix is for jdk8. It is also applicable to jdk7 and will be proposed, if it's accepted here. WEBREV: http://cr.openjdk.java.net/~robm/8014676/webrev.01/ http://cr.openjdk.java.net/%7Erobm/8014676/webrev.01/ BUG: http://bugs.sun.com/view_bug.do?bug_id=8014676 (may not be available yet) The problem is observed when the binaries for windows are placed under a path which contains a space. Sincerely, Ivan