Hi Josh, I was afk for most of the last week, but others have been testing with a compiler build from that branch. The problem was encountered at least daily prior to using the branch build, and has not been observed since swapping to the build with this 'fix'.
So I think the basic assumption of the problem (as mentioned earlier in the thread) is correct, or at least close to being correct. And the changes in the branch 'fix' it. So when you get a chance can you please review the PR.

From my perspective, the most important part here has been 'verifying' the cause (in this case by using a prospective solution), because it was not something that was easy to repro. If you have a better solution for it, I don't have any concerns if that PR is rejected. But perhaps the new test in the branch is useful for an alternative solution in this case.

On Fri, May 15, 2026 at 6:11 PM Yishay Weiss wrote:

> Sorry Greg, I'll use these jars next week. Didn't get to it...
> ________________________________
> From: Greg Dove
> Sent: Friday, May 15, 2026 3:35 AM
> Subject: Re: Non-deterministic output issues
>
> Hi Josh,
>
> I don't yet have what I feel is a definitive justification that the 'fix' works, because testers have not had enough time with it. But for those that have been using it, thus far they have not encountered the problem.
> So I am not making a PR yet, because I would like to be more confident that this at least fixes it - that might take a few more days. However I did push the commits to the following branch:
> https://github.com/apache/royale-compiler/tree/local_var_resolution_issue
> in case you want to look at it.
>
> The symptom was a rare failure to find the correct type of a local var inside a function scope. Usually the first one worked, but others did not.
>
> The assumption for how it occurs is:
> 1. Memory Pressure: A GC event clears the FunctionNode but leaves the FunctionScope intact.
> 2. Scope Reconnection: When the FileNode is re-parsed, it creates a new "shell" FunctionNode. The existing FunctionScope is reconnected to this new node.
> 3. The "Wipe": To ensure no stale data remains, the reconnection process explicitly wipes all local variable definitions from the scope.
> 4. The Failure: Under certain conditions, the compiler would attempt to resolve a name (look up a variable) after the wipe but before the function body was fully re-parsed. Because the scope was empty, the resolution would either return null (untyped) or incorrectly find something with the same name in an outer scope (e.g., a class member).
>
> The general 'solution' is intended as a sort of "just-in-time" restoration of local definitions by ensuring that the act of asking for a definition triggers the re-parsing of the body if it is missing.
> Because it is so hard to repro, the sequence 1-4 above is still an assumed cause, which is why I need to wait for confirmation of it not happening any more from others who were seeing it more routinely than I did. But it does match the symptoms exactly, which makes me hopeful.
>
> There is a new test in there (NameResolutionAfterGCTest) that is intended to simulate something close to the above.
>
> There are a few other changes in the branch - changes related to cache, and also (something I discovered as another 'rare' thing after the name lookup changes, when running regular royale framework build) to thread-safety with metatag Array - that might need more work/attention. So far I see no noticeable adverse effect to performance.
> There are a few other minor changes I used during logging that I left in there as well.
>
> Anyhow, feel free to come up with a better way to do this if it is obvious to you (assuming it is confirmed to be the cause of the problem - I hope it is).
>
> Thanks
> -Greg
>
> On Thu, May 14, 2026 at 8:34 AM Greg Dove wrote:
>
>> Sounds good Josh, I will do that. I haven't heard feedback yet from day one of others testing the patched compiler.
>> I will wait one more day (still with my fingers crossed) before I even assume that it works.
>>
>> On Thu, May 14, 2026 at 4:48 AM Josh Tynjala wrote:
>>
>>> Thanks, Greg. I guess a branch is good, in case I'd like to suggest any tweaks.
>>>
>>> --
>>> Josh Tynjala
>>> Bowler Hat LLC
>>> https://bowlerhat.dev/
>>>
>>> On Tue, May 12, 2026 at 10:10 PM Greg Dove wrote:
>>>
>>>> Josh, I *think* it might be a combination of the two. I'm asking others who were seeing it more often than I did to test a possible fix (I will share an updated compiler build with them), because repro of the actual issue is still quite challenging.
>>>> *if* it works (maybe will need 1-2 days to be sure), I'll push it either directly to dev or via branch/PR (lmk what you prefer) and I'd certainly appreciate your review of that if possible. I did use a bit of AI support for the sleuthing and the testing, but you have spent much more time in the compiler codebase than I have.
>>>> For now, though, fingers crossed....
>>>>
>>>> On Wed, May 13, 2026 at 4:00 AM Josh Tynjala wrote:
>>>>
>>>>> Thanks for the update, Greg. Threading could certainly be a cause if we are missing some kind of synchronization. I know that we have workspace.startBuilding() and workspace.startIdleState() as ways of ensuring threads are under control. We may be missing one of those calls somewhere before emitting JS.
>>>>>
>>>>> As for GC, I recall that reducing JVM memory wasn't necessarily enough for me to reproduce the other GC related bug I mentioned, strange as that seems.
>>>>> I remember also adding System.gc() calls in various places (though I don't remember exactly where), and I think that's what finally allowed me to reproduce the issue semi-reliably.
>>>>>
>>>>> --
>>>>> Josh Tynjala
>>>>> Bowler Hat LLC
>>>>> https://bowlerhat.dev/
>>>>>
>>>>> On Mon, May 11, 2026 at 10:15 PM Greg Dove wrote:
>>>>>
>>>>>> Hey Josh,
>>>>>>
>>>>>> I am actively looking into this again. I am less convinced that it is GC related (I reduced memory allocation to low levels) and perhaps it is more to do with threads/race-conditions. But it's very difficult to be sure, I spent today adding logging and trying to repro, but did not repro the bug all day. I will keep on this tomorrow trying to find the right conditions to force it to occur. If I can figure out what those are, I will share them with you.
>>>>>>
>>>>>> -Greg
>>>>>>
>>>>>> On Tue, May 5, 2026 at 9:06 AM Greg Dove wrote:
>>>>>>
>>>>>>> Wake up brain (self talk):
>>>>>>> "and then not wrong for subsequent output" <- should be of course "and then wrong for subsequent output".
>>>>>>>
>>>>>>> On Tue, May 5, 2026 at 9:05 AM Greg Dove wrote:
>>>>>>>
>>>>>>>> Thanks for looking into this, Josh.
>>>>>>>>
>>>>>>>> "If it isn't too difficult to reproduce"
>>>>>>>> Quick comments, just in case it helps:
>>>>>>>>
>>>>>>>> It was not something I could repro for debugging purposes in the compiler.
>>>>>>>> It was still 'rare' in practice - max 2-3 times per day that I observed, sometimes only once a day - and not manifesting in the same code - although perhaps that is simply because code can change a lot between compiler runs - and "awareness" was based on the app not starting up correctly or noticeable runtime errors. I did not check this: perhaps it is happening more often than I think but with no side effects. This could happen if it sometimes outputs a typed method as instance.method() where type resolution worked and elsewhere alongside as instance['method']() where it did not. The problem might simply not get noticed in this case, but this is pure speculation, I have not checked for this.
>>>>>>>>
>>>>>>>> I did not try reducing heap allocation or anything to try to create conditions for it to perhaps happen more often if it is memory/GC related.
>>>>>>>>
>>>>>>>> I see notes like this in the code:
>>>>>>>> // If we get this far, then we did not find a cached entry
>>>>>>>> // It is possible for 2+ threads to get in here for the same name.
>>>>>>>> // This is intentional - the worst that happens is that we duplicate the resolution work
>>>>>>>> // the benefit is that we avoid any sort of locking, which was proving expensive (time wise,
>>>>>>>> // and memory wise).
>>>>>>>>
>>>>>>>> When you see the code that was problematic output, you can see the same name lookup inside a js method that is obviously correctly resolved (anecdotally it seems to be more often 'correct' the first time) and then not wrong for subsequent output, in nearby code, so I assume it might be related to some unsynchronized state or failure to do that 'duplicate' resolution work, where the various parts were being processed in parallel...
>>>>>>>>
>>>>>>>> Anyway, good luck, please let me know if you have anything you think I could do to help.
>>>>>>>>
>>>>>>>> On Tue, May 5, 2026 at 6:29 AM Harbs wrote:
>>>>>>>>
>>>>>>>>> Sure. I’ll be in touch off list.
>>>>>>>>>
>>>>>>>>>> On May 4, 2026, at 9:18 PM, Josh Tynjala wrote:
>>>>>>>>>>
>>>>>>>>>> Would you be willing to give me access to the project? If it isn't too difficult to reproduce, I may be able to figure out what's going on and how to restore the missing typing data, similar to my other fix. My feeling is that the original Adobe devs intended for occasional garbage collection to occur to stay within memory limits, but that the data would be restorable, if needed later. I think that they simply missed some places where it might need to be restored because it happens pretty rarely. Or maybe our newer JS emitter isn't properly accounting for that possibility.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Josh Tynjala
>>>>>>>>>> Bowler Hat LLC
>>>>>>>>>> https://bowlerhat.dev/
>>>>>>>>>>
>>>>>>>>>> On Mon, May 4, 2026 at 10:37 AM Harbs wrote:
>>>>>>>>>>
>>>>>>>>>>>> You've tested that this issue still reproduces using a compiler built from the latest source code?
>>>>>>>>>>>
>>>>>>>>>>> This was reproduced by a number of devs all working on the same project. And yes, it was with recent builds.
>>>>>>>>>>>
>>>>>>>>>>> I don’t think I personally have seen it (I have a lot of memory on my machine), but it seems to have gotten worse recently. I don’t know if something changed in the compiler or it’s due to the increased project size.
>>>>>>>>>>>
>>>>>>>>>>> This was with variables, not functions.
>>>>>>>>>>>
>>>>>>>>>>> Harbs
>>>>>>>>>>>
>>>>>>>>>>>> On May 4, 2026, at 6:54 PM, Josh Tynjala wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> This issue may be the same one:
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/apache/royale-compiler/issues/182
>>>>>>>>>>>>
>>>>>>>>>>>> I also encountered and fixed an issue related to weak references a little over a year ago. Function bodies were getting garbage collected, and I needed to clear out some stale definitions that were causing missing classes in generated ASDoc output and some similar issues with the -watch compiler option.
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/apache/royale-compiler/commit/35eed62f13519c659e6346d26cca3f44afe3170f
>>>>>>>>>>>>
>>>>>>>>>>>> This fix does not appear to have made it into a release yet. You're not using an older compiler build, right? You've tested that this issue still reproduces using a compiler built from the latest source code?
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Josh Tynjala
>>>>>>>>>>>> Bowler Hat LLC
>>>>>>>>>>>> https://bowlerhat.dev/
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, May 3, 2026 at 9:40 PM Greg Dove wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Compiler issues - (Josh, please?)
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have a medium-sized project that has begun encountering occasional/rare (but at least daily during normal workloads) compilation issues that appear to be related to name/type resolution. There can be code within a method output where the name resolves correctly to its type in one part of the method's js output and elsewhere within the same js method output as if it was Object/untyped. This is most obvious with XML or XMLList instances (because of .child('prop') vs ['prop'] differences).
>>>>>>>>>>>>> I've also seen it get confused between local variables and instance properties in some cases, which I believe is a manifestation of the same thing. In other words, different compilation runs with the exact same settings are not completely deterministic, because sometimes they can provide different output. It is very difficult to repro, because it feels so random. But it has been something that appears to be more frequent as the codebase grows (when all other settings remain the same). This led me to consider that it could be GC-related, and I recently removed the SoftReferences inside ASScopeCache, as a prime suspect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> After doing this, I have not seen the problem since (so far - after 1.5 days)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The ASScopeCache instances themselves are weakly held (inside CompilerProject). So the internal maps inside each of these instances being weakly held as well seems to be the problem, the internal maps can perhaps get into a partially cleared state between threads.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I did some memory profiling with and without this change for removing the SoftReferences inside ASScopeCache - but it was quite limited (just testing with compiling the one project). The memory usage was not much different on a typical run (approx 1 MB difference for a compilation with around 1000 .as and .mxml files combined, alongside a bunch of local swcs). There was possibly a small speed up without the SoftReferences, but I did not test enough to be sure.
>>>>>>>>>>>>> But so far it seems there is not a big impact on memory with omitting these. If it introduces consistency I'm kinda keen to get it in there - I know others have definitely seen this problem too.
>>>>>>>>>>>>> And for Josh in particular: I think your compiler experience dwarfs the rest of us and wanted to get your feedback instead of just jumping in with this one. One option could also be to make this change as a compiler option, with the new non-weak references being the default, but with the ability to switch to the older behaviour via the option if that was considered important as well... look forward to hearing your thoughts.
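
For readers following the SoftReference discussion in the original message, here is a rough Java sketch of the two cache behaviours being compared. It is only an illustration under assumptions: ScopeCacheSketch, lookup, simulateCollection and demoResolutions are hypothetical names, not the real ASScopeCache code, and the garbage collector is imitated by calling Reference.clear() by hand. The boolean flag stands in for the compiler option suggested above. The lookup deliberately takes no lock, in the spirit of the code comment quoted in the thread, so two threads may duplicate the resolution work.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache for illustration only; not the royale-compiler ASScopeCache. */
final class ScopeCacheSketch {
    private final ConcurrentHashMap<String, SoftReference<String>> softEntries = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, String> strongEntries = new ConcurrentHashMap<>();
    private final boolean useSoftReferences;          // stands in for the proposed compiler option
    private final Function<String, String> resolver;  // assumed to never return null
    private final AtomicInteger resolutions = new AtomicInteger();

    ScopeCacheSketch(boolean useSoftReferences, Function<String, String> resolver) {
        this.useSoftReferences = useSoftReferences;
        this.resolver = resolver;
    }

    String lookup(String name) {
        String cached;
        if (useSoftReferences) {
            SoftReference<String> ref = softEntries.get(name);
            cached = (ref != null) ? ref.get() : null; // null if never cached or already cleared
        } else {
            cached = strongEntries.get(name);
        }
        if (cached != null) {
            return cached;
        }
        // No locking: two threads can both get here for the same name and
        // duplicate the resolution work, which is harmless if the resolver is pure.
        resolutions.incrementAndGet();
        String resolved = resolver.apply(name);
        if (useSoftReferences) {
            softEntries.put(name, new SoftReference<>(resolved));
        } else {
            strongEntries.put(name, resolved);
        }
        return resolved;
    }

    /** Imitates the collector clearing a softly held entry; a no-op for strong entries. */
    void simulateCollection(String name) {
        SoftReference<String> ref = softEntries.get(name);
        if (ref != null) {
            ref.clear();
        }
    }

    int resolutionCount() {
        return resolutions.get();
    }

    /** Looks a name up, clears it, looks it up again, and reports how often it was resolved. */
    static int demoResolutions(boolean useSoftReferences) {
        ScopeCacheSketch cache = new ScopeCacheSketch(useSoftReferences, name -> "type-of-" + name);
        cache.lookup("item");
        cache.simulateCollection("item");
        cache.lookup("item");
        return cache.resolutionCount();
    }
}
```

With soft references the second lookup has to resolve again after the entry is cleared; with strong references it does not. The sketch does not reproduce the non-determinism itself, it only shows where a cleared entry forces a second resolution whose result then depends on the state of whatever the resolver reads.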

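
The assumed sequence 1-4 and the "just-in-time" restoration described earlier in the thread can be sketched in a few lines of Java. This is a toy model under stated assumptions, not the code in the local_var_resolution_issue branch: LazyScopeSketch, reconnect, resolveWithoutRestore and resolve are hypothetical names, types are plain strings, and the "re-parse" is a supplier that rebuilds the local definitions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for a function scope; not the royale-compiler FunctionScope API. */
final class LazyScopeSketch {
    private final Map<String, String> outerScope;            // name -> type in the enclosing scope
    private final Supplier<Map<String, String>> bodyParser;  // re-parses the body, returns local name -> type
    private Map<String, String> locals;                      // null means "wiped, body not re-parsed yet"

    LazyScopeSketch(Map<String, String> outerScope, Supplier<Map<String, String>> bodyParser) {
        this.outerScope = outerScope;
        this.bodyParser = bodyParser;
        this.locals = bodyParser.get();
    }

    /** Step 3 of the assumed sequence: reconnection wipes the local definitions. */
    synchronized void reconnect() {
        locals = null;
    }

    /** Step 4: a lookup after the wipe with no restoration falls through to the outer scope. */
    synchronized String resolveWithoutRestore(String name) {
        if (locals != null && locals.containsKey(name)) {
            return locals.get(name);
        }
        return outerScope.get(name); // null (untyped) or a same-named outer definition
    }

    /** "Just-in-time" restoration: asking for a name re-parses the body first if it is missing. */
    synchronized String resolve(String name) {
        if (locals == null) {
            locals = bodyParser.get();
        }
        String type = locals.get(name);
        return type != null ? type : outerScope.get(name);
    }

    /** Returns the type seen before the wipe, after it without restoration, and after it with restoration. */
    static String[] demo() {
        Map<String, String> outer = new HashMap<>();
        outer.put("item", "Object"); // class member with the same name as the local
        Supplier<Map<String, String>> parser = () -> {
            Map<String, String> parsed = new HashMap<>();
            parsed.put("item", "XML"); // local var declared in the function body
            return parsed;
        };
        LazyScopeSketch scope = new LazyScopeSketch(outer, parser);
        String before = scope.resolve("item");
        scope.reconnect();
        String wiped = scope.resolveWithoutRestore("item");
        String restored = scope.resolve("item");
        return new String[] { before, wiped, restored };
    }
}
```

In the demo, the unrestored lookup after the wipe picks up the outer "Object" definition, which matches the reported symptom of a local being confused with an instance property, while the restoring lookup sees the local "XML" type again.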