On Wed, May 18, 2016 at 11:39 PM, Ben Coman <b...@openinworld.com> wrote:

> On Thu, May 19, 2016 at 8:49 AM, Martin McClure <mar...@hand2mouse.com>
> wrote:
> > On 05/18/2016 03:17 PM, Martin McClure wrote:
> >>
> >> On 05/18/2016 08:49 AM, Mariano Martinez Peck wrote:
> >>>
> >>> Hi guys,
> >>>
> >>> I am seeing a problem in Pharo 5.0 regarding Delay >> wait. I cannot
> >>> explain how this could happen, but it does, and it has happened to me
> >>> a couple of times (though it is not fully reproducible).
> >>>
> >>
> >> Hmm. The schedulerResumptionTime is, somehow, being (approximately)
> >> doubled. It's not clear how that can happen, but I'll look a little
> >> more.
> >>
> >
> > Mariano, is there any chance that you might be saving the image during
> > one of these Delays?
> >
> >
> > This one smells like a race condition, and I think I see something that
> > *might* explain it. But I don't have any more time to spend on this one,
> > so I'll leave the rest to someone else. I hope this is helpful:
> >
> > The only way I immediately see for the schedulerResumptionTime to become
> > approximately doubled is if the Delay's resumption time is adjusted by
> > #restoreResumptionTimes without previously having been adjusted by
> > #saveResumptionTimes.
> >
> > The only time either of those is sent, that I can see, is on saving the
> > image. Both are normally sent (save before the snapshot, restore
> > afterwards), but there may be a hole there.
> >
> > #saveResumptionTimes is only sent (by this scheduler class) when the
> > accessProtect semaphore is held, but #handleTimerEvent: is executed in
> > the timing Process *without* the protection of accessProtect, in the
> > case of the VM signaling the timingSemaphore. If the VM signals the
> > timingSemaphore, #handleTimerEvent: could run in the middle of
> > #saveResumptionTimes. If some Delay expires because of that timer
> > event, our Delay could move from being the first suspended delay to
> > being the active delay. If that happens after we've adjusted the active
> > delay, but before we've processed the suspended delays, that Delay will
> > not get adjusted, and will show the symptoms that Mariano is seeing.
>
> A quick experiment to test this might be, in #shutDown/#startUp, trying...
>     [ self saveResumptionTimes ] valueAt: Processor timingPriority
>     [ self restoreResumptionTimes ] valueAt: Processor timingPriority
>
> >
> > Also, I'm not sure how the Heap that holds the suspendedDelays will
> > react to being modified in the middle of an enumeration. That might
> > open a larger window for the problem.
> >
> > Regards,
> >
> > -Martin
> >
>
> Even if not directly related to Mariano's problem, I agree with your
> general assessment.  I'm not comfortable with the way that
> #save/#restoreResumptionTimes (which manipulate suspendedDelays) are
> called from user priority code via #shutDown/#startUp.  Per the
> original code**, accessProtect can't be used inside the timing
> priority #handleTimerEvent since accessProtect is held by the user
> priority #schedule when it uses "timingSemaphore signal" to invoke
> #handleTimerEvent.  accessProtect never protected
> timingPriority manipulation of suspendedDelays by #handleTimerEvent,
> nor expired delays waking up.  But ahhh... the disabling of
> accessProtect previously prevented new delays being scheduled between
> a #save and #restore. If a new delay is scheduled after the #save,
> when it is #restore'd its resumptionTime would be wrong.
>
> Waiting in the wings for Pharo 6 I have changes that should help:
> * have #save/#restoreResumptionTimes *only* called from timing
> priority event loop (i.e. #handleTimerEvent)
> * shutDown/startUp suspends/resumes the timing priority event loop,
> instead of trying to block signals to timingSemaphore
>
> I haven't touched it for a few months so I'll need to chase it up to
> provide a preview.


>
> Mariano, can you try DelayMillisecondScheduler (which however is
> missing some fixes for other issues).
>
>

Ben,

Did you find a chance to do something else with this? I am still seeing
lockups every once in a while, even with the DelayMillisecondScheduler :(
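
For reference, the approximate doubling Martin describes above can be
reproduced with plain arithmetic. This is only a minimal sketch for a
Playground, not the scheduler's actual source: the adjust block and all the
numbers are hypothetical, and it assumes each resumption time is rebased as
newBase + (time - oldBase), with a base of 0 while the image is saved.

```smalltalk
"Hypothetical model of the save/restore adjustment; NOT the Pharo source."
| adjust beforeSave afterSave resumption saved restored skippedSave |

"Assumed rebasing rule: shift a time from one clock base to another."
adjust := [ :time :oldBase :newBase | newBase + (time - oldBase) ].

beforeSave := 1000000.            "made-up clock value when the save pass runs"
afterSave := 1000200.             "made-up clock value when the restore pass runs"
resumption := beforeSave + 500.   "a Delay due 500 ticks after the save"

"Normal path: save rebases to 0, restore rebases to the new clock."
saved := adjust value: resumption value: beforeSave value: 0.
restored := adjust value: saved value: 0 value: afterSave.

"Faulty path: the Delay missed the save pass but still gets the restore pass."
skippedSave := adjust value: resumption value: 0 value: afterSave.

{ saved. restored. skippedSave }   "inspect: 500, 1000700, 2000700"
```

If that assumption holds, a Delay that misses the save pass ends up with the
current clock value added to an already absolute time, which would match the
roughly doubled schedulerResumptionTime.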

Thanks!




-- 
Mariano
http://marianopeck.wordpress.com
