The first thing the newly forked job does anyway is update its job
file, as its first op-code is now waiting. It is enough that this
update is replicated to all master candidates. Note that replicating
the local change of the livelock file would not help the other master
candidates, as the livelock has meaning only on the current node.

This seemingly little saving of two replications is significant,
however, as it happens under the fork lock by which we avoid two
forks at the same time, as this can cause problems with the Haskell
runtime.

Signed-off-by: Klaus Aehlig <[email protected]>
Reviewed-by: Petr Pudlak <[email protected]>

Cherry-picked-from: 7684a50192bb
Signed-off-by: Klaus Aehlig <[email protected]>
---
 src/Ganeti/JQScheduler.hs | 2 +-
 src/Ganeti/JQueue.hs      | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/Ganeti/JQScheduler.hs b/src/Ganeti/JQScheduler.hs
index de14512..cfd355c 100644
--- a/src/Ganeti/JQScheduler.hs
+++ b/src/Ganeti/JQScheduler.hs
@@ -419,7 +419,7 @@ scheduleSomeJobs qstate = do
   mapM_ (attachWatcher qstate) chosen
 
   -- Start the jobs.
-  result <- JQ.startJobs cfg (jqLivelock qstate) (jqForkLock qstate) jobs
+  result <- JQ.startJobs (jqLivelock qstate) (jqForkLock qstate) jobs
   let badWith (x, Bad y) = Just (x, y)
       badWith _          = Nothing
   let failed = mapMaybe badWith $ zip chosen result
diff --git a/src/Ganeti/JQueue.hs b/src/Ganeti/JQueue.hs
index c0422bc..6040249 100644
--- a/src/Ganeti/JQueue.hs
+++ b/src/Ganeti/JQueue.hs
@@ -536,15 +536,15 @@ isQueueOpen :: IO Bool
 isQueueOpen = liftM not (jobQueueDrainFile >>= doesFileExist)
 
 -- | Start enqueued jobs by executing the Python code.
-startJobs :: ConfigData
-          -> Livelock -- ^ Luxi's livelock path
+startJobs :: Livelock -- ^ Luxi's livelock path
           -> Lock -- ^ lock for forking new processes
           -> [QueuedJob] -- ^ the list of jobs to start
           -> IO [ErrorResult QueuedJob]
-startJobs cfg luxiLivelock forkLock jobs = do
+startJobs luxiLivelock forkLock jobs = do
   qdir <- queueDir
   let updateJob job llfile =
-        void . writeAndReplicateJob cfg qdir $ job { qjLivelock = Just llfile }
+        void . mkResultT . writeJobToDisk qdir
+          $ job { qjLivelock = Just llfile }
   let runJob job = withLock forkLock $ do
         (llfile, _) <- Exec.forkJobProcess (qjId job) luxiLivelock
                          (updateJob job)
-- 
2.2.0.rc0.207.ga3a616c
