Here are some example runs of `rake test` in Rails, using just a single suite and using nailgun to speed things up:
https://gist.github.com/1313476

I'm running more tests with the parallel_load branch.

- Charlie

On Tue, Oct 25, 2011 at 10:51 AM, Charles Oliver Nutter <head...@headius.com> wrote:
> So here's one discovery...I turned on the JVM's sampling profiler
> (--sample flag to JRuby) when running "rake test" and discovered that
> it causes *four* JVM processes to be launched. Seriously?
>
> If they're all booting Rails, it's no wonder "rake test" takes
> forever. I'm looking into it now.
>
> - Charlie
>
> On Tue, Oct 25, 2011 at 10:36 AM, Charles Oliver Nutter
> <head...@headius.com> wrote:
>> That's a big unknown for us. It does not seem to be heavily IO-driven,
>> since using nailgun does help "pure load" scenarios speed up
>> significantly. For example:
>>
>> INIT OF JRUBY ALONE
>>
>> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5
>>                          user     system      total        real
>> in-process `jruby `  0.043000   0.000000   0.043000 (  0.027000)
>>                          user     system      total        real
>> in-process `jruby `  0.045000   0.000000   0.045000 (  0.045000)
>>                          user     system      total        real
>> in-process `jruby `  0.018000   0.000000   0.018000 (  0.018000)
>>                          user     system      total        real
>> in-process `jruby `  0.014000   0.000000   0.014000 (  0.014000)
>>                          user     system      total        real
>> in-process `jruby `  0.014000   0.000000   0.014000 (  0.014000)
>>
>> INIT OF JRUBY PLUS -rubygems
>>
>> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 -rubygems
>>                                   user     system      total        real
>> in-process `jruby -rubygems`  0.193000   0.000000   0.193000 (  0.177000)
>>                                   user     system      total        real
>> in-process `jruby -rubygems`  0.085000   0.000000   0.085000 (  0.085000)
>>                                   user     system      total        real
>> in-process `jruby -rubygems`  0.085000   0.000000   0.085000 (  0.085000)
>>                                   user     system      total        real
>> in-process `jruby -rubygems`  0.071000   0.000000   0.071000 (  0.071000)
>>                                   user     system      total        real
>> in-process `jruby -rubygems`  0.076000   0.000000   0.076000 (  0.076000)
>>
>> ...PLUS require 'activerecord'
>>
>> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 "-rubygems
>> -e \"require 'activerecord'\""
>>       user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.192000   0.000000   0.192000 (  0.176000)
>>       user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.087000   0.000000   0.087000 (  0.087000)
>>       user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.087000   0.000000   0.087000 (  0.087000)
>>       user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.069000   0.000000   0.069000 (  0.069000)
>>       user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.078000   0.000000   0.078000 (  0.078000)
>>
>> Note how much startup improves for subsequent runs in the -rubygems
>> and -r activerecord cases. If it were solely IO-bound, we wouldn't see
>> that much improvement.
>>
>> Startup time issues are a combination of factors:
>>
>> * IO, including filesystem searching and the actual read of the file
>> * Parsing and AST building
>> * JVM being cold; our parser, interpreter, core classes are all
>>   running at their slowest
>> * Internal caches getting vigorously flushed at boot, since there's so
>>   many methods and constants being created
>>
>> My parallelizing patch helps the first three but didn't make a big
>> difference in actual execution of commands like "rake test" in a Rails
>> app. I'm going to poke at startup a bit more today and see if I can
>> figure out how much time in "rake test" is *actually* booting versus
>> execution.
>>
>> - Charlie
>>
>> On Mon, Oct 24, 2011 at 11:56 PM, Andrew Cholakian <and...@andrewvc.com>
>> wrote:
>>> I'm wondering how much of the issue is IO and how much is CPU time required
>>> to parse. Would it be easiest to just do a quick scan for module
>>> dependencies and cache all the files ASAP, then parse serially? I'm not sure
>>> if it'd be possible to do a quick parse for just 'require'.
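
A rough sketch of the "quick scan for just 'require'" idea raised above, in plain Ruby. The names scan_requires and REQUIRE_LINE are made up for illustration and are not part of JRuby. The regex only catches requires with a literal string argument at the start of a line, so dynamic requires (computed paths, conditionals, autoload) would still be missed.

```ruby
# Hypothetical helper, not a JRuby API: find literal `require '...'` lines
# with a regex instead of running the full parser.
REQUIRE_LINE = /^\s*require\s*\(?\s*(['"])([^'"]+)\1/

def scan_requires(source)
  # Collect the required name from each matching line, skipping the rest.
  source.each_line.map { |line|
    match = REQUIRE_LINE.match(line)
    match && match[2]
  }.compact
end

sample = <<~RUBY
  require 'ALib'
  a = 10 + 2
  require "BLib"
  # require 'commented_out'
  require File.join(dir, 'dynamic')
RUBY

p scan_requires(sample)
```

Only the two literal requires are picked up here; the commented-out line and the File.join call are skipped, which is exactly the limitation a scan like this would have on real code.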
>>>
>>> On Mon, Oct 24, 2011 at 9:47 PM, Jonathan Coveney <jcove...@gmail.com>
>>> wrote:
>>>>
>>>> I was thinking about the case below, and I think that this is an
>>>> interesting idea, but I'm wondering how you would resolve certain
>>>> difficulties. Imagine:
>>>>
>>>> require 'ALib'
>>>> a = 10+2
>>>> require 'BLib'
>>>> b=a/2
>>>>
>>>> where ALib is a lot of random stuff, then:
>>>> class Fixnum
>>>>   def +(other)
>>>>     self*other
>>>>   end
>>>> end
>>>>
>>>> and BLib is a lot of random stuff, then:
>>>> class Fixnum
>>>>   def /(other)
>>>>     self*other*other
>>>>   end
>>>> end
>>>>
>>>> How would you know how to resolve these various pieces? I guess you
>>>> mention eager interpreting and then a cache, but given that any module can
>>>> change any other module's functionality, you would have to keep track of
>>>> everything that you eagerly interpreted, and possibly go back depending on
>>>> what your module declares. How else would you know that a module that
>>>> doesn't depend on any other modules is going to actually execute in a
>>>> radically different way because of another module that you have included?
>>>> The only way I can think of would be if the thread executing any given
>>>> piece of code kept track of the calls that it made and where, and then
>>>> went back to the earliest piece it had to in the case that anything was
>>>> rewritten...but then you could imagine an even more convoluted case where
>>>> module A changes an earlier piece of module B such that it changes how a
>>>> later piece of itself works...and so on.
>>>>
>>>> Perhaps this is incoherent, but I think the question is how you deal with
>>>> the fact that separately running pieces of code can change the fundamental
>>>> underlying state of the world.
>>>>
>>>> 2011/10/24 Charles Oliver Nutter <head...@headius.com>
>>>>>
>>>>> Nahi planted an interesting seed on Twitter...what if we could
>>>>> parallelize parsing of Ruby files when loading a large application?
>>>>>
>>>>> At a naive level, parallelizing the parse of an individual file is
>>>>> tricky to impossible; the parser state is very much straight-line. But
>>>>> perhaps it's possible to parallelize loading of many files?
>>>>>
>>>>> I started playing with parallelizing calls to the parser, but that
>>>>> doesn't really help anything; every call to the parser blocks waiting
>>>>> for it to complete, and the contents are not interpreted until after
>>>>> that point. That means that "require" lines remain totally opaque,
>>>>> preventing us from proactively starting threaded parses of additional
>>>>> files. But there lies the opportunity: what if load/require requests
>>>>> were done as Futures, require/load lines were eagerly interpreted by
>>>>> submitting load/require requests to a thread pool, and child requires
>>>>> could be loading and parsing at the same time as the parent
>>>>> file...without conflicting?
>>>>>
>>>>> In order to do this, I think we would need to make the following
>>>>> modifications:
>>>>>
>>>>> * LoadService would need to expose Future-based versions of "load"
>>>>>   and "require". The initial file loaded as the "main" script would be
>>>>>   synchronous, but subsequent requires and loads could be shunted to a
>>>>>   thread pool.
>>>>> * The parser would need to initiate eager load+parse of files
>>>>>   encountered in require-like and load-like lines. This load+parse would
>>>>>   encompass filesystem searching plus content parsing, so all the heavy
>>>>>   lifting of booting a file would be pushed into the thread pool.
>>>>> * Somewhere (perhaps in LoadService) we would maintain an LRU cache
>>>>>   mapping from file paths to ASTs. The cache would contain Futures;
>>>>>   getting the actual parsed library would then simply be a matter of
>>>>>   Future.get, allowing many of the load+parses to be done
>>>>>   asynchronously.
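
The cache-of-Futures idea in the list above could be sketched roughly like this in plain Ruby. Everything here is an assumption for illustration: FutureLoadCache, prefetch and get are hypothetical names, not JRuby's LoadService API. A Thread stands in for a Future (Thread#value plays the role of Future.get), there is one thread per file rather than a real thread pool, and there is no LRU eviction.

```ruby
# Sketch only: hypothetical class, not JRuby's LoadService.
class FutureLoadCache
  def initialize(&parser)
    @parser  = parser   # stand-in for "search the filesystem + parse the file"
    @futures = {}       # path => Thread (no LRU eviction in this sketch)
    @lock    = Mutex.new
  end

  # Start the load+parse in the background; at most one thread per path.
  def prefetch(path)
    @lock.synchronize do
      @futures[path] ||= Thread.new { @parser.call(path) }
    end
  end

  # Block until the parse for this path has finished and return its result.
  def get(path)
    prefetch(path).value
  end
end

# Usage with a fake "parser" that just tags the path.
cache = FutureLoadCache.new { |path| [:ast, path] }
%w[a.rb b.rb c.rb].each { |path| cache.prefetch(path) }
p cache.get("b.rb")
```

The point of the shape is that prefetch can be called as soon as a require-like line is seen, while get is only called when the file actually has to be interpreted, so the expensive work overlaps with whatever the parent file is doing. It says nothing about the ordering problems raised elsewhere in this thread.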
>>>>>
>>>>> For a system like Rails, where there might be hundreds of files
>>>>> loaded, this could definitely improve startup performance.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> - Charlie
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe from this list, please visit:
>>>>>
>>>>> http://xircles.codehaus.org/manage_email