To have a single-process ghc --make -j you first of all need internal thread-safety:
GHC internally keeps a number of global caches that need to be made thread-safe:

 - the table of interned strings (this is actually written in C and accessed via the FFI)
 - the cache of loaded interface files; these are actually loaded lazily using unsafeInterleaveIO magic (yuck)
 - the cache of package descriptions (I think)
 - the NameCache: a cache of string -> magic number, used to implement fast comparisons between symbols. The magic numbers are generated non-deterministically (more unsafeInterleaveIO), so you need to keep this cache around.
 - HomeModules: the modules that have been compiled in this --make run.

The NameCache is used when loading interface files and also by the parser.

Making these things thread-safe basically involves updating these caches via atomicModifyIORef instead of plain modifyIORef. I made those changes a few years ago, but at least one of them was rolled back. I forget the details, but I think it was one use of unsafePerformIO that caused the issues: unsafePerformIO needs to traverse the stack to look for thunks that are potentially being evaluated by multiple threads, and if you have a deep stack that can be expensive. SimonM has since added stack chunks, which should reduce this overhead. It could be worthwhile re-evaluating the patch.

To have a multi-process ghc --make you don't need thread-safety. However, without sharing the caches -- in particular the interface file caches -- the time to read data from disk may outweigh any advantage from parallel execution. Evan's approach of using a long-running worker process avoids reloading most of the caches for each module, but it probably couldn't take advantage of the HomeModules cache. It would be interesting to see whether that was the issue. It would also be interesting to find out whether disk access or serialisation overhead is the bottleneck; if it's the former, some clever use of mmap could help.
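To illustrate the kind of change I mean: here's a minimal sketch (not GHC's actual code; the names NameCache/internName are made up for the example) of a global string -> magic-number cache where the read-modify-write is done with atomicModifyIORef, so two compile threads interning names concurrently can't lose each other's insertions the way they could with a plain readIORef/writeIORef or modifyIORef sequence.

```haskell
import Data.IORef
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for GHC's NameCache: string -> magic number.
type NameCache = IORef (Map.Map String Int)

newNameCache :: IO NameCache
newNameCache = newIORef Map.empty

-- Look up a name, allocating a fresh number if it is not cached.
-- atomicModifyIORef performs the whole read-modify-write as one
-- atomic step, so concurrent callers never clobber each other.
internName :: NameCache -> String -> IO Int
internName ref s = atomicModifyIORef ref $ \m ->
  case Map.lookup s m of
    Just n  -> (m, n)
    Nothing -> let n = Map.size m
               in (Map.insert s n m, n)
```

The real NameCache allocates its numbers non-deterministically rather than from the map size, but the thread-safety fix is the same shape: replace each modifyIORef on a shared cache with atomicModifyIORef.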
HTH,
 / Thomas

On 13 May 2013 17:35, Evan Laforge <qdun...@gmail.com> wrote:
> I wrote a ghc-server that starts a persistent process for each cpu.
> Then a 'ghc' frontend wrapper sticks each job in a queue. It seemed
> to be working, but timing tests didn't reveal any speed-up. Then I
> got a faster computer and lost motivation. I didn't investigate very
> deeply why it didn't speed up as I hoped. It's possible the approach
> is still valid, but I made some mistake in the implementation.
>
> So I can stop rewriting this little blurb, I put it on github:
>
> https://github.com/elaforge/ghc-server
>
> On Mon, May 13, 2013 at 8:40 PM, Niklas Hambüchen <m...@nh2.me> wrote:
>> I know this has been talked about before and also a bit in the recent
>> GSoC discussion.
>>
>> I would like to know what prevents ghc --make from working in parallel,
>> who worked on that in the past, what their findings were, and a general
>> estimation of the difficulty of the problem.
>>
>> Afterwards, I would update
>> http://hackage.haskell.org/trac/ghc/ticket/910 with a short summary of
>> the current situation.
>>
>> Thanks to those who know more!

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe