Hi Ben,

I opened this issue to track the problem so that we don't forget it:
https://pharo.fogbugz.com/f/cases/18359/Problem-with-DelayExperimentalSpinScheduler-and-delay

Cheers,

On Fri, May 20, 2016 at 2:58 PM, Ben Coman <b...@openinworld.com> wrote:

> On Sat, May 21, 2016 at 12:05 AM, Mariano Martinez Peck
> <marianop...@gmail.com> wrote:
> > Ben, for the record, I have been using DelayMillisecondScheduler for a
> > day and a half and so far no problem.
>
> Cool. That's why I left it there. I hope to soon have something for you
> to try with the newer design. Thanks for the update.
>
> cheers -ben
>
> > On Thu, May 19, 2016 at 9:19 AM, Mariano Martinez Peck
> > <marianop...@gmail.com> wrote:
> >>
> >>
> >>
> >> On Wed, May 18, 2016 at 9:49 PM, Martin McClure <mar...@hand2mouse.com>
> >> wrote:
> >>>
> >>> On 05/18/2016 03:17 PM, Martin McClure wrote:
> >>>>
> >>>> On 05/18/2016 08:49 AM, Mariano Martinez Peck wrote:
> >>>>>
> >>>>> Hi guys,
> >>>>>
> >>>>> I am seeing a problem in Pharo 5.0 regarding Delay >> wait. I cannot
> >>>>> explain how this could happen, but it does, and it has happened to me
> >>>>> a couple of times (though it is not fully reproducible).
> >>>>>
> >>>>
> >>>> Hmm. The schedulerResumptionTime is, somehow, being (approximately)
> >>>> doubled. It's not clear how that can happen, but I'll look a little more.
> >>>>
> >>>
> >>> Mariano, is there any chance that you might be saving the image during
> >>> one of these Delays?
> >>>
> >>>
> >>> This one smells like a race condition, and I think I see something that
> >>> *might* explain it. But I don't have any more time to spend on this one,
> >>> so I'll leave the rest to someone else. I hope this is helpful:
> >>>
> >>> The only way I immediately see for the schedulerResumptionTime to become
> >>> approximately doubled is if the Delay's resumption time is adjusted by
> >>> #restoreResumptionTimes without previously having been adjusted by
> >>> #saveResumptionTimes.
> >>>
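
To make the arithmetic in the paragraph above concrete, here is a minimal sketch in Python (a simplified model, not Pharo code). It assumes that saving converts an absolute resumption time into "time remaining" by subtracting the current clock value, and that restoring adds the clock value back. The function names and the clock numbers are hypothetical and chosen only for illustration.

```python
def save_resumption_time(resumption_time, now):
    """Before a snapshot: turn an absolute resumption time into time remaining."""
    return resumption_time - now


def restore_resumption_time(saved_value, now):
    """After a snapshot: turn time remaining back into an absolute time."""
    return saved_value + now


now = 1_000_000            # hypothetical clock reading around the snapshot
resumption = now + 5_000   # a delay that is due 5000 ticks from now

# Normal case: save followed by restore leaves the resumption time intact.
balanced = restore_resumption_time(save_resumption_time(resumption, now), now)

# Faulty case: restore is applied to a value that was never saved, so the
# clock value is added on top of an already absolute time.
unbalanced = restore_resumption_time(resumption, now)

print(balanced, unbalanced)
```

In this model the unbalanced value is roughly twice the clock reading, which matches the "approximately doubled" symptom.
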
> >>> The only time either of those is sent, that I can see, is on saving the
> >>> image. Both are normally sent (save before the snapshot, restore
> >>> afterwards), but there may be a hole there.
> >>>
> >>
> >> Martin, first off, thanks for the research!!!
> >>
> >> Now... your email made me remember something: I did get a VM crash a
> >> couple of times when saving the image. If I re-opened the image, it
> >> looked as if the image had indeed been saved (so the snapshot primitive
> >> itself did work), but I suspect not all of the shutdown code ran
> >> correctly.
> >>
> >> The VM crash looks like the FreeTypeFace >> pvtDestroyHandle one, which,
> >> as far as I know, is a "known crash" (I attach the crash dump). From what
> >> I can see, if I follow the whole stack, the crash starts from
> >> WeakArray >> startUp: .
> >> That means that, depending on the order of the startup list, the
> >> Scheduler may not have been run after the crash.
> >>
> >> Now.... WeakArray initialization does:
> >>
> >> SessionManager default
> >>     registerSystemClassNamed: self name.
> >>
> >> While...
> >>
> >> Delay class >> startUp
> >>     "Restart active delay, if any, when resuming a snapshot."
> >>     Scheduler startUp.
> >>
> >> And the Delay registration is
> >>
> >> SessionManager default
> >>     registerSystemClassNamed: self name
> >>     atPriority: 20.
> >>
> >> So...that seems correct...
> >>
> >> I can verify this by:
> >>
> >> SessionManager default systemCategory prioritizedList
> >>
> >> Anyway...not sure if this adds something, but just wanted to note this.
> >>
> >>
> >>>
> >>> #saveResumptionTimes is only sent (by this scheduler class) when the
> >>> accessProtect semaphore is held, but #handleTimerEvent: is executed in
> >>> the timing Process *without* the protection of accessProtect, in the
> >>> case of the VM signaling the timingSemaphore. If the VM signals the
> >>> timingSemaphore, #handleTimerEvent: could run in the middle of
> >>> #saveResumptionTimes. If some Delay expires because of that timer event,
> >>> our Delay could move from being the first suspended delay to being the
> >>> active delay. If that happens after we've adjusted the active delay, but
> >>> before we've processed the suspended delays, that Delay will not get
> >>> adjusted, and will show the symptoms that Mariano is seeing.
> >>>
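
The interleaving described above can be replayed deterministically with a small Python model (again a sketch, not the Pharo implementation). The `Scheduler` class, its method names, and the `timer_fires_midway` flag are hypothetical stand-ins. The model assumes the active delay is adjusted first and the suspended delays afterwards, as the paragraph above implies, and it forces the timer event to land between those two steps rather than using real threads.

```python
class Delay:
    def __init__(self, resumption_time):
        self.resumption_time = resumption_time


class Scheduler:
    def __init__(self, active, suspended):
        self.active = active
        self.suspended = list(suspended)  # ordered by resumption time

    def handle_timer_event(self):
        # The active delay expires; the first suspended delay becomes active.
        self.active = self.suspended.pop(0) if self.suspended else None

    def save_resumption_times(self, now, timer_fires_midway=False):
        if self.active is not None:
            self.active.resumption_time -= now
        if timer_fires_midway:
            # Simulates the timer event running in the middle of the save.
            self.handle_timer_event()
        for delay in self.suspended:
            delay.resumption_time -= now

    def restore_resumption_times(self, now):
        if self.active is not None:
            self.active.resumption_time += now
        for delay in self.suspended:
            delay.resumption_time += now


def run(timer_fires_midway):
    now = 1_000_000                      # hypothetical clock reading
    first = Delay(now + 10)              # active delay, about to expire
    ours = Delay(now + 5_000)            # first suspended delay
    later = Delay(now + 9_000)           # another suspended delay
    scheduler = Scheduler(first, [ours, later])
    scheduler.save_resumption_times(now, timer_fires_midway)
    scheduler.restore_resumption_times(now)
    return ours.resumption_time, later.resumption_time


print(run(False))  # no interleaving
print(run(True))   # timer event lands between the two save steps
```

In the interleaved run, "ours" is promoted to active after the active delay was adjusted, so it is skipped by the save loop but still picked up by the restore, and only its resumption time ends up inflated by one clock value.
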
> >>> Also, I'm not sure how the Heap that holds the suspendedDelays will
> >>> react to being modified in the middle of an enumeration. That might open
> >>> a larger window for the problem.
> >>>
> >>> Regards,
> >>>
> >>> -Martin
> >>>
> >>
> >>
> >>
> >> --
> >> Mariano
> >> http://marianopeck.wordpress.com
> >
> >
> >
> >
> > --
> > Mariano
> > http://marianopeck.wordpress.com
>
>


-- 
Mariano
http://marianopeck.wordpress.com
