On Tue, 7 Mar 2023 22:14:39 GMT, Patricio Chilano Mateo <pchilanom...@openjdk.org> wrote:
> Please review the following fix. The Method instance representing > Continuation.enterSpecial() is replaced by a new Method during redefinition > of the Continuation class. The already existing nmethod for it is not used, > but a new one will be generated the first time enterSpecial() is resolved > after redefinition. This means we could have more than one nmethod > representing enterSpecial(), in particular, one generated before redefinition > took place, and one after it. Now, when walking the stack, if we found a > return barrier pc (as in Continuation::is_return_barrier_entry()) and we want > to keep walking the physical stack then we know the sender will be the > enterSpecial frame so we create it by calling ContinuationEntry::to_frame(). > This method assumes there can only be one nmethod associated with > enterSpecial() so we hit an assert later on. See the bug for more details of > the crash. > > As I mentioned in the bug we don't need to rely on this assumption since we > can re-read the updated value from _enter_special. But reading both > _enter_special and _return_pc means we would need some kind of > synchronization since to_frame() could be called concurrently with > set_enter_code(). To avoid that we could just read _return_pc and calculate > the blob from it each time, but I'm also assuming that overhead is undesired > and that's why the static variable was introduced. Alternatively > _enter_special could be read and _return_pc could be derived from it (by > adding an extra field in the nmethod class). But if we go this route I think > we would need to do a small fix on thaw too. After redefinition and before a > new call to resolve enterSpecial(), the last thaw call for some virtual > thread would create an entry frame with an old _return_pc (see > ThawBase::new_entry_frame() and ThawBase::patch_return()). I'm not sure about > the lifecycle of the old CodeBlob but it seems it could have bee n already removed if enterSpecial was not found while traversing everybody's stack. Maybe there are more issues. > > The simple solution implemented here is just to disallow redefinition of the > Continuation class altogether. Another less restrictive option would be to > keep the already generated enterSpecial nmethod, if there is one. I can also > investigate one of the routes mentioned previously if desired. > > I tested the fix with the simple reproducer I added to the bug and also with > the previously crashing HelidonAppTest.java test. > > Thanks, > Patricio This pull request has now been integrated. Changeset: 8b740b46 Author: Patricio Chilano Mateo <pchilanom...@openjdk.org> URL: https://git.openjdk.org/jdk/commit/8b740b46091c853c7cb66c361deda6dfbb2cedc8 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod 8302779: HelidonAppTest.java fails with "assert(_cb == CodeCache::find_blob(pc())) failed: Must be the same" or SIGSEGV Reviewed-by: coleenp, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/12911