On Wed, Aug 16, 2017 at 7:46 AM, Guillermo Polito <guillermopol...@gmail.com > wrote:
> > On Tue, Aug 15, 2017 at 3:25 PM, Ben Coman <b...@openinworld.com> wrote: > >> In case any of the shutdown/startup scripts use a delay, now or in the >> future, >> I'd first try at highestPriority-1 to avoid influence on the >> DelayScheduler. >> but then Eliot's suggestion to valueUnpreemptively may avoid that anyway. >> > > Why should a shutdown/startup use a delay? > > startup and shutdown should be fast and not be blocked... > okay. My response was a just reflex to not assume what some third-party developer might do. I'm not familiar with the startup/shutdown and had not considered it deeply. > If there is a delay on client code, it should block a client thread, not a > system thread... > > Moreover, I see a series of issues in having the delay process running in > higher priority than the startup. If I'm wrong, please correct me because > otherwise that means there is something I'm not getting. > > First, today the Delay scheduling process is being terminated on shutdown > and being re-initialized on startup. This means that even if a > shutdown/startup action tries to use a delay that will fail/block > indefinitely? > That is not how I understand it works. The DelayScheduler process is not terminated, just suspended by DelayMicrosecondScheduler>>shutdown. #stopTimerEventLoop is not called during shutdown. > > Second, what happens with race conditions between the startup and the > delay process? If the shutdown is in the middle of terminating the delay > process and the delay process gets suddenly activated? > > stopTimerEventLoop > "Stop the timer event loop" > > timerEventLoop ifNotNil: [ timerEventLoop terminate ]. > timerEventLoop := nil. > That is effectively only called when changing which between different DelayScheduler-implementations. > Maybe before terminating the timerEventLoop we need to suspend it? That > will at least atomically (primitive) remove the process from the ready list > and avoid it from being activated again, no? > I'm not completely happy with how the DelayScheduler shutdown happens now, whether some case might cause timingSemaphore to be triggered to run, and how #save/#restoreResumptionTimes are called from low-priority code which feels susceptible to race conditions (although I've not identified any). A while a go I was playing with the idea that to **ensure** the timerEventLoop doesn't run, you add another semaphore in middle activated by #shutDown, - but it was near a Pharo 5 release so I didn't push it in, and then failed to get back to it. From memory something like... DelayScheduler>>handleTimerEventLoop: microsecondNowTick .... suspendDelaysSemaphore ifNotNil: [ self saveResumptionTimes. suspendDelaysSemaphore wait. suspendDelaysSemaphore := nil. self restoreResumptionTimes. ]. DelayScheduler>>shutDown suspendDelaysSemaphore := Semaphore new. DelayScheduler>>startUp suspendDelaysSemaphore signal. I feel a dedicated sempahore would be more robust to control of the suspension of the DelayScehduler during shutdown/start, rather than leaving it waiting on the timingSemaphore that several things interact with and hoping none signal it. Maybe it time for me to try for my first Pharo 7 contribution. cheers -ben > In any case, I see no good in letting a delay work on startup. That is far > too low level and the system would be in a far too unstable state to run > any code other than the startup itself. > > >> btw, what happens if an error occurs inside valueUnpreemptively? >> Does the normal priority debugger still get to run? >> >> cheers -ben >> >> On Mon, Aug 14, 2017 at 6:42 PM, Guillermo Polito < >> guillermopol...@gmail.com> wrote: >> >>> Hi all, >>> >>> I'm proposing a kind-of critical change that I believe is very good for >>> the health of the system: I want that the startup of the system runs in >>> maximum priority and becomes non-interruptable. >>> >>> Right now, when you save your image, the shutdown and startup are run in >>> the same priority than the process that triggered the save (usually the ui >>> or the command line, priority 40). This can cause lots of problems and race >>> conditions: processes with higher priorities can interrupt the >>> shutdown/startup and try to do something while the system is unstable. As a >>> side effect also, when you use extensively the command line, you start >>> stacking startup contexts from old sessions: >>> >>> ... >>> session 3 ctxt 4 <- This guy makes a save and a new session starts >>> session 3 ctxt 3 >>> session 3 ctxt 2 >>> session 3 ctxt 1 >>> session 2 ctxt 4 <- This guy makes a save and a new session starts >>> session 2 ctxt 3 >>> session 2 ctxt 2 >>> session 2 ctxt 1 >>> session 1 ctxt 4 <- This guy makes a save and a new session starts >>> session 1 ctxt 3 >>> session 1 ctxt 2 >>> session 1 ctxt 1 >>> >>> Old contexts are never collected, and the objects they referenced >>> neither. >>> >>> To fix these two problems I propose to do every image save/session start >>> in a new process in maximum priority. That way, other process should not be >>> able to interrupt the startup process. Moreover, every session >>> shutdown/startup should happen in a new clean process, to avoid the session >>> stacking. >>> >>> For normal users, this should have no side effect at all. This change >>> will have a good impact on people working on the debugger and the stack >>> such as fueling-out the stack because they will have a cleaner stack. >>> >>> There is however a side-effect/design point to consider: startup actions >>> should be quick to run. If a startup action requires to run a long-running >>> action such as starting a server or managing a command line action, that >>> should run in a separate process with lower priority (usually >>> userPriority). In other words, the startup action should create a new >>> process managing its action. >>> >>> If you want to review (and I'd be glad) >>> >>> Pull request: https://github.com/pharo-project/pharo/pull/198 >>> Fogbugz issue: https://pharo.fogbugz.com/f/cases/20309 >>> Current validation going on: https://ci.inria.fr/pharo-ci-j >>> enkins2/job/Test%20pending%20pull%20request%20and%20branch%2 >>> 0Pipeline/view/change-requests/job/PR-198/ >>> >>> Guille >>> >>> -- >>> >>> >>> >>> Guille Polito >>> >>> >>> Research Engineer >>> >>> French National Center for Scientific Research - *http://www.cnrs.fr* >>> <http://www.cnrs.fr> >>> >>> >>> >>> *Web:* *http://guillep.github.io* <http://guillep.github.io> >>> >>> *Phone: *+33 06 52 70 66 13 <+33%206%2052%2070%2066%2013> >>> >> >> > > > -- > > > > Guille Polito > > > Research Engineer > > French National Center for Scientific Research - *http://www.cnrs.fr* > <http://www.cnrs.fr> > > > > *Web:* *http://guillep.github.io* <http://guillep.github.io> > > *Phone: *+33 06 52 70 66 13 <+33%206%2052%2070%2066%2013> >