Hi Ben,

I opened this issue to track the problem so that it is not forgotten:
https://pharo.fogbugz.com/f/cases/18359/Problem-with-DelayExperimentalSpinScheduler-and-delay
Cheers,

On Fri, May 20, 2016 at 2:58 PM, Ben Coman <b...@openinworld.com> wrote:

> On Sat, May 21, 2016 at 12:05 AM, Mariano Martinez Peck
> <marianop...@gmail.com> wrote:
> > Ben, for the record, I have been using DelayMillisecondScheduler for a
> > day and a half and so far no problem.
>
> Cool. That's why I left it there. I hope to soon have something for you
> to try with the newer design. Thanks for the update.
>
> cheers -ben
>
> > On Thu, May 19, 2016 at 9:19 AM, Mariano Martinez Peck
> > <marianop...@gmail.com> wrote:
> >>
> >> On Wed, May 18, 2016 at 9:49 PM, Martin McClure <mar...@hand2mouse.com>
> >> wrote:
> >>>
> >>> On 05/18/2016 03:17 PM, Martin McClure wrote:
> >>>>
> >>>> On 05/18/2016 08:49 AM, Mariano Martinez Peck wrote:
> >>>>>
> >>>>> Hi guys,
> >>>>>
> >>>>> I am seeing a problem in Pharo 5.0 regarding Delay >> wait. I cannot
> >>>>> explain how this could happen, but it does, and it has happened to
> >>>>> me a couple of times (but it is not fully reproducible).
> >>>>>
> >>>>
> >>>> Hmm. The schedulerResumptionTime is, somehow, being (approximately)
> >>>> doubled. It's not clear how that can happen, but I'll look a little
> >>>> more.
> >>>>
> >>>
> >>> Mario, is there any chance that you might be saving the image during
> >>> one of these Delays?
> >>>
> >>> This one smells like a race condition, and I think I see something
> >>> that *might* explain it. But I don't have any more time to spend on
> >>> this one, so I'll leave the rest to someone else. I hope this is
> >>> helpful:
> >>>
> >>> The only way I immediately see for the schedulerResumptionTime to
> >>> become approximately doubled is if the Delay's resumption time is
> >>> adjusted by #restoreResumptionTimes without previously having been
> >>> adjusted by #saveResumptionTimes.
> >>>
> >>> The only time either of those is sent, that I can see, is on saving
> >>> the image. Both are normally sent (save before the snapshot, restore
> >>> afterwards), but there may be a hole there.
> >>>
> >>
> >> Martin, first off, thanks for the research!!!
> >>
> >> Now....your email made me remember something: I did get a VM crash
> >> when saving the image a couple of times. If I re-opened the image, it
> >> looked as if the image was indeed saved (so the snapshot primitive
> >> itself did work), but I suspect not all shutdown code could have been
> >> run correctly.
> >>
> >> The VM crash looks like the FreeTypeFace >> pvtDestroyHandle one
> >> which, as far as I know, is a "known crash" (I attach the crash dump).
> >> From what I can see, if I follow the whole stack, the crash starts
> >> from WeakArray >> startUp: .
> >> That means that...depending on the order of the startup list...the
> >> Scheduler may not have been run after the crash.
> >>
> >> Now.... WeakArray initialization does:
> >>
> >>     SessionManager default
> >>         registerSystemClassNamed: self name.
> >>
> >> While...
> >>
> >>     Delay class >> startUp
> >>         "Restart active delay, if any, when resuming a snapshot."
> >>         Scheduler startUp.
> >>
> >> And the Delay registration is
> >>
> >>     SessionManager default
> >>         registerSystemClassNamed: self name
> >>         atPriority: 20.
> >>
> >> So...that seems correct...
> >>
> >> I can verify this by:
> >>
> >>     SessionManager default systemCategory prioritizedList
> >>
> >> Anyway...not sure if this adds something, but just wanted to note this.
> >>
> >>>
> >>> #saveResumptionTimes is only sent (by this scheduler class) when the
> >>> accessProtect semaphore is held, but #handleTimerEvent: is executed
> >>> in the timing Process *without* the protection of accessProtect, in
> >>> the case of the VM signaling the timingSemaphore. If the VM signals
> >>> the timingSemaphore, #handleTimerEvent: could run in the middle of
> >>> #saveResumptionTimes. If some Delay expires because of that timer
> >>> event, our Delay could move from being the first suspended delay to
> >>> being the active delay. If that happens after we've adjusted the
> >>> active delay, but before we've processed the suspended delays, that
> >>> Delay will not get adjusted, and will show the symptoms that Mariano
> >>> is seeing.
> >>>
> >>> Also, I'm not sure how the Heap that holds the suspendedDelays will
> >>> react to being modified in the middle of an enumeration. That might
> >>> open a larger window for the problem.
> >>>
> >>> Regards,
> >>>
> >>> -Martin
> >>>
> >>
> >> --
> >> Mariano
> >> http://marianopeck.wordpress.com
> >
> > --
> > Mariano
> > http://marianopeck.wordpress.com

--
Mariano
http://marianopeck.wordpress.com
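
P.S. To make the race Martin describes easier to follow, here is a small model of it. This is only a sketch in Python, not the Pharo scheduler code: the class and method names (Delay, Scheduler, adjust, run) and the clock values are made up for illustration, and the timer event is injected by hand at the point between adjusting the active delay and adjusting the suspended ones. It assumes, as in Martin's analysis, that save rebases resumption times from "now" to zero and restore rebases them from zero back to "now".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Delay:
    """Hypothetical stand-in for a Delay; time is in arbitrary clock ticks."""
    name: str
    resumption_time: int

    def adjust(self, old_base: int, new_base: int) -> None:
        # Rebase the resumption time from one clock origin to another.
        self.resumption_time += new_base - old_base


class Scheduler:
    """Toy scheduler with one active delay and a queue of suspended ones."""

    def __init__(self, active: Optional[Delay], suspended: List[Delay]) -> None:
        self.active = active
        self.suspended = list(suspended)

    def handle_timer_event(self) -> None:
        # The active delay expired: promote the first suspended delay.
        self.active = self.suspended.pop(0) if self.suspended else None

    def save_resumption_times(self, now: int, timer_fires_mid_save: bool) -> None:
        if self.active is not None:
            self.active.adjust(old_base=now, new_base=0)
        if timer_fires_mid_save:
            # Models the unprotected timer event landing in the middle of save.
            self.handle_timer_event()
        for delay in self.suspended:
            delay.adjust(old_base=now, new_base=0)

    def restore_resumption_times(self, now: int) -> None:
        if self.active is not None:
            self.active.adjust(old_base=0, new_base=now)
        for delay in self.suspended:
            delay.adjust(old_base=0, new_base=now)


def run(timer_fires_mid_save: bool) -> int:
    """Save and restore around a snapshot; return the second delay's time."""
    now = 1_000_000
    first = Delay("first", now + 10)
    second = Delay("second", now + 5_000)
    scheduler = Scheduler(active=first, suspended=[second])
    scheduler.save_resumption_times(now, timer_fires_mid_save)
    scheduler.restore_resumption_times(now)
    return second.resumption_time
```

In this model, run(False) leaves the second delay at its original resumption time, while run(True) adds the clock value to it a second time, because the delay was promoted to active after the active slot had already been saved and so was restored without ever being saved. That roughly doubles the value, which matches the symptom reported at the start of the thread.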