Here are some example runs of `rake test` in Rails, using just a single
suite and using nailgun to speed things up:

https://gist.github.com/1313476

I'm running more tests with the parallel_load branch.

- Charlie

On Tue, Oct 25, 2011 at 10:51 AM, Charles Oliver Nutter
<head...@headius.com> wrote:
> So here's one discovery...I turned on the JVM's sampling profiler
> (--sample flag to JRuby) when running "rake test" and discovered that
> it causes *four* JVM processes to be launched. Seriously?
>
> If they're all booting Rails, it's no wonder "rake test" takes
> forever. I'm looking into it now.
>
> - Charlie
>
> On Tue, Oct 25, 2011 at 10:36 AM, Charles Oliver Nutter
> <head...@headius.com> wrote:
>> That's a big unknown for us. It does not seem to be heavily IO-driven,
>> since using nailgun does help "pure load" scenarios speed up
>> significantly. For example:
>>
>> INIT OF JRUBY ALONE
>>
>> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5
>>                          user     system      total        real
>> in-process `jruby `   0.043000   0.000000   0.043000 (  0.027000)
>>                          user     system      total        real
>> in-process `jruby `   0.045000   0.000000   0.045000 (  0.045000)
>>                          user     system      total        real
>> in-process `jruby `   0.018000   0.000000   0.018000 (  0.018000)
>>                          user     system      total        real
>> in-process `jruby `   0.014000   0.000000   0.014000 (  0.014000)
>>                          user     system      total        real
>> in-process `jruby `   0.014000   0.000000   0.014000 (  0.014000)
>>
>> INIT OF JRUBY PLUS -rubygems
>>
>> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 -rubygems
>>                          user     system      total        real
>> in-process `jruby -rubygems`  0.193000   0.000000   0.193000 (  0.177000)
>>                          user     system      total        real
>> in-process `jruby -rubygems`  0.085000   0.000000   0.085000 (  0.085000)
>>                          user     system      total        real
>> in-process `jruby -rubygems`  0.085000   0.000000   0.085000 (  0.085000)
>>                          user     system      total        real
>> in-process `jruby -rubygems`  0.071000   0.000000   0.071000 (  0.071000)
>>                          user     system      total        real
>> in-process `jruby -rubygems`  0.076000   0.000000   0.076000 (  0.076000)
>>
>> ...PLUS require 'activerecord'
>>
>> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 "-rubygems
>> -e \"require 'activerecord'\""
>>                          user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.192000
>> 0.000000   0.192000 (  0.176000)
>>                          user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.087000
>> 0.000000   0.087000 (  0.087000)
>>                          user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.087000
>> 0.000000   0.087000 (  0.087000)
>>                          user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.069000
>> 0.000000   0.069000 (  0.069000)
>>                          user     system      total        real
>> in-process `jruby -rubygems -e "require 'activerecord'"`  0.078000
>> 0.000000   0.078000 (  0.078000)
>>
>> Note how much startup improves for subsequent runs in the -rubygems
>> and -r activerecord cases. If it were solely IO-bound, we wouldn't see
>> that much improvement.
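
[A harness along these lines can be sketched in plain Ruby. This is a
hypothetical stand-in, not the actual bench_jruby_init.rb, which drives
JRuby in-process; this version just shells out and times N launches:]

```ruby
require 'benchmark'

# Hypothetical sketch of a startup benchmark in the spirit of
# bench_jruby_init.rb. It times N launches of a command running an
# empty script, prints each run, and returns the wall-clock times.
def bench_startup(cmd, iterations)
  iterations.times.map do |i|
    real = Benchmark.realtime { system("#{cmd} -e nil", out: File::NULL) }
    printf("run %d of `%s -e nil`: %.3fs\n", i + 1, cmd, real)
    real
  end
end

bench_startup("ruby", 3) if __FILE__ == $PROGRAM_NAME
```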
>>
>> Startup time issues are a combination of factors:
>>
>> * IO, including filesystem searching and the actual read of the file
>> * Parsing and AST building
>> * JVM being cold; our parser, interpreter, core classes are all
>> running at their slowest
>> * Internal caches getting vigorously flushed at boot, since there are
>> so many methods and constants being created
>>
>> My parallelizing patch helps the first three but didn't make a big
>> difference in the actual execution of commands like "rake test" in a
>> Rails app. I'm going to poke at startup a bit more today and see if I
>> can figure out how much time in "rake test" is *actually* booting
>> versus execution.
>>
>> - Charlie
>>
>> On Mon, Oct 24, 2011 at 11:56 PM, Andrew Cholakian <and...@andrewvc.com> 
>> wrote:
>>> I'm wondering how much of the issue is IO and how much is CPU time required
>>> to parse. Would it be easiest to just do a quick scan for module
>>> dependencies and cache all the files ASAP, then parse serially? I'm not sure
>>> if it'd be possible to do a quick parse for just 'require'.
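
[A quick static scan like that can be sketched with a regex; this is a
hypothetical illustration, and it can only see literal require
arguments, not computed ones, which is exactly the hard part:]

```ruby
# Hypothetical pre-scan for static require targets, as a cheap
# alternative to a full parse. Computed or conditional requires are
# invisible to a regex like this.
REQUIRE_RE = /^\s*(?:require|require_relative)\s+['"]([^'"]+)['"]/

def scan_requires(source)
  source.each_line.map { |line| line[REQUIRE_RE, 1] }.compact
end

src = <<~RUBY
  require 'alib'
  x = 1 + 2
  require "blib"
  require some_computed_name   # not detectable statically
RUBY

p scan_requires(src)  # ["alib", "blib"]
```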
>>>
>>> On Mon, Oct 24, 2011 at 9:47 PM, Jonathan Coveney <jcove...@gmail.com>
>>> wrote:
>>>>
>>>> I was thinking about the case below, and I think that this is an
>>>> interesting idea, but I'm wondering how you would resolve certain
>>>> difficulties. Imagine:
>>>>
>>>> require 'ALib'
>>>> a = 10+2
>>>> require 'BLib'
>>>> b=a/2
>>>>
>>>> where ALib is a lot of random stuff, then:
>>>> class Fixnum
>>>>   def +(other)
>>>>     self*other
>>>>   end
>>>> end
>>>>
>>>> and BLib is a lot of random stuff, then:
>>>> class Fixnum
>>>>   def /(other)
>>>>     self*other*other
>>>>   end
>>>> end
>>>>
>>>> How would you know how to resolve these various pieces? I guess you
>>>> mention eager interpreting and then a cache, but given that any module can
>>>> change any other module's functionality, you would have to keep track of
>>>> everything that you eagerly interpreted, and possibly go back depending on
>>>> what your module declares. How else would you know that a module that
>>>> doesn't depend on any other modules is going to actually execute in a
>>>> radically different way because of another module that you have included?
>>>> The only way I can think of would be if the thread executing any given 
>>>> piece
>>>> of code kept track of the calls that it made and where, and then went back
>>>> to the earliest piece it had to in the case that anything was
>>>> rewritten...but then you could imagine an even more convoluted case where
>>>> module A changes an earlier piece of module B such that it changes how a
>>>> later piece of itself works...and so on.
>>>>
>>>> Perhaps this is incoherent, but I think the real question is how you
>>>> deal with the fact that separately running pieces of code can change
>>>> the fundamental underlying state of the world.
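
[The hazard can be shown with a toy class standing in for Fixnum; this
is a hypothetical sketch, using a custom class so we don't patch a core
class, but patching Fixnum itself behaves the same way:]

```ruby
# Toy demonstration of why eager interpretation is unsafe: a later
# load can redefine a method that earlier code already used. Num
# stands in for Fixnum here.
class Num
  attr_reader :v

  def initialize(v)
    @v = v
  end

  def +(other)
    Num.new(v + other.v)
  end
end

a = (Num.new(10) + Num.new(2)).v   # original +, evaluated before the patch

# The equivalent of ALib's Fixnum patch arriving mid-program:
class Num
  def +(other)
    Num.new(v * other.v)
  end
end

b = (Num.new(10) + Num.new(2)).v   # same expression, redefined +

puts a  # 12
puts b  # 20
```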
>>>>
>>>> 2011/10/24 Charles Oliver Nutter <head...@headius.com>
>>>>>
>>>>> Nahi planted an interesting seed on Twitter...what if we could
>>>>> parallelize parsing of Ruby files when loading a large application?
>>>>>
>>>>> At a naive level, parallelizing the parse of an individual file is
>>>>> tricky to impossible; the parser state is very much straight-line. But
>>>>> perhaps it's possible to parallelize loading of many files?
>>>>>
>>>>> I started playing with parallelizing calls to the parser, but that
>>>>> doesn't really help anything; every call to the parser blocks waiting
>>>>> for it to complete, and the contents are not interpreted until after
>>>>> that point. That means that "require" lines remain totally opaque,
>>>>> preventing us from proactively starting threaded parses of additional
>>>>> files. But therein lies the opportunity: what if load/require
>>>>> requests were done as Futures, require/load lines were eagerly
>>>>> interpreted by submitting them to a thread pool, and child requires
>>>>> could be loading and parsing at the same time as the parent
>>>>> file, without conflicting?
>>>>>
>>>>> In order to do this, I think we would need to make the following
>>>>> modifications:
>>>>>
>>>>> * LoadService would need to expose Future-based versions of "load"
>>>>> and "require". The initial file loaded as the "main" script would be
>>>>> synchronous, but subsequent requires and loads could be shunted to a
>>>>> thread pool.
>>>>> * The parser would need to initiate eager load+parser of files
>>>>> encountered in require-like and load-like lines. This load+parse would
>>>>> encompass filesystem searching plus content parsing, so all the heavy
>>>>> lifting of booting a file would be pushed into the thread pool.
>>>>> * Somewhere (perhaps in LoadService) we would maintain an LRU cache
>>>>> mapping from file paths to ASTs. The cache would contain Futures;
>>>>> getting the actual parsed library would then simply be a matter of
>>>>> Future.get, allowing many of the load+parses to be done
>>>>> asynchronously.
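
[A minimal sketch of that shape in plain Ruby might look like the
following. The names are hypothetical; JRuby's real LoadService lives on
the Java side, and this uses bare Threads where a bounded pool would be
used in practice:]

```ruby
# Hypothetical future-based load service. submit() kicks off an
# asynchronous "search + parse" on a worker thread and caches the
# future; require_now() blocks only when the result is actually
# needed, so many files can be parsed concurrently.
class ParseFuture
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Block until the work is done and return its result.
  def get
    @thread.value
  end
end

class FutureLoadService
  def initialize
    @cache = {}    # path => ParseFuture; an LRU policy could cap this
    @mutex = Mutex.new
  end

  # Start an eager load+parse; returns the future immediately.
  def submit(path)
    @mutex.synchronize { @cache[path] ||= ParseFuture.new { parse(path) } }
  end

  # Synchronous require: reuses a future already in flight if present.
  def require_now(path)
    submit(path).get
  end

  private

  # Stand-in for filesystem search plus real parsing.
  def parse(path)
    "AST(#{path})"
  end
end

svc = FutureLoadService.new
%w[a.rb b.rb c.rb].each { |f| svc.submit(f) }  # eager, in parallel
puts svc.require_now("b.rb")  # AST(b.rb)
```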
>>>>>
>>>>> For a system like Rails, where there might be hundreds of files
>>>>> loaded, this could definitely improve startup performance.
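
[The LRU piece is cheap to sketch in Ruby, since Hashes iterate in
insertion order; a hypothetical path-to-parse-result cache:]

```ruby
# Minimal LRU cache sketch for the path -> parsed-result mapping.
# Re-inserting an entry on access keeps the most recently used keys
# at the end of the Hash; eviction drops the first (oldest) key.
class LRUCache
  def initialize(capacity)
    @capacity = capacity
    @store = {}
  end

  def [](key)
    return nil unless @store.key?(key)
    @store[key] = @store.delete(key)   # move to most-recent position
  end

  def []=(key, value)
    @store.delete(key)
    @store[key] = value
    @store.delete(@store.first[0]) while @store.size > @capacity
  end

  def keys
    @store.keys
  end
end

cache = LRUCache.new(2)
cache["a.rb"] = :ast_a
cache["b.rb"] = :ast_b
cache["a.rb"]              # touch a.rb so it becomes most recent
cache["c.rb"] = :ast_c     # evicts b.rb, the least recently used
p cache.keys  # ["a.rb", "c.rb"]
```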
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> - Charlie
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe from this list, please visit:
>>>>>
>>>>>    http://xircles.codehaus.org/manage_email
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
