On Sun, Jul 26, 2009 at 12:01 AM, Subramanya Sastry<[email protected]> wrote:
>> However--as I have pointed out to Charlie a number of times--in practice,
>> classes are basically frozen after *some* time. In Rails, pretty much all
>> classes reach their final state at the end of the bootup phase. However,
>> since JRuby only sees a parse phase and then a generic "runtime", it's not
>> possible for it to determine when that has happened.
>
> This may not be that hard. If the assumption is that the compiler is going
> to optimize for the case where all class modifications happen in some
> startup phase, then there are two approaches:
>
> 1. During the initial interpretation phase, track the number of class
> modifications over time, implicitly plotting a curve of modifications
> against time. When the slope of that curve starts flattening out, assume
> you are getting out of the boot phase.
>
> 2. Use a variant of an exponential/random backoff technique: each time a
> class modification is encountered, back off compilation for another X
> amount of time, where X is varied after each modification. Also require at
> least N clean backoff phases (ones that see no class modifications); at
> that point, start compilation.
>
> Note that these techniques will not work as well for cases where this code
> modification profile isn't met. Alternatively, you could develop different
> compilation strategies for different code modification profiles (one for
> Rails, one for something else, etc.) and use a command-line option to
> select the appropriate strategy.
Yeah, this is all good. I had not thought about having a global or
per-class profile of changes over time or of having a backoff
mechanism to delay compilation further. Multiple profiles make sense
too, since we can learn from framework authors what runtime
characteristics their frameworks have.
There's also the option, for specific use cases, of allowing users to
freeze classes at some specific point in time. From then on, we
consider the class unmodifiable, and use that information to better
optimize calls against it.
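The backoff idea above can be sketched in a few lines of Java (JRuby's implementation language). All class names, thresholds, and window sizes here are illustrative, not actual JRuby internals: a window that sees a class modification resets the counter and doubles the backoff, and compilation starts only after N consecutive clean windows.

```java
// Sketch of the backoff heuristic for detecting the end of the boot phase.
// Names and thresholds are illustrative, not JRuby API.
class BootPhaseDetector {
    private static final int REQUIRED_CLEAN_WINDOWS = 3;
    private long windowMillis = 100;          // current backoff window length
    private long windowStart;
    private int cleanWindows = 0;
    private boolean modifiedInWindow = false;

    BootPhaseDetector(long now) { windowStart = now; }

    // Called by the runtime whenever a class is opened/modified.
    void onClassModification() {
        modifiedInWindow = true;
    }

    // Called periodically (e.g. per invocation or on a timer tick).
    // Returns true once we believe the boot phase is over.
    boolean shouldStartCompiling(long now) {
        if (now - windowStart >= windowMillis) {
            if (modifiedInWindow) {
                cleanWindows = 0;
                windowMillis *= 2;            // back off further after a dirty window
            } else {
                cleanWindows++;
            }
            modifiedInWindow = false;
            windowStart = now;
        }
        return cleanWindows >= REQUIRED_CLEAN_WINDOWS;
    }
}
```

A per-class variant would just keep one detector per RubyClass instead of a global one.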
> But Tom has a good argument that eval is probably not used that often, at
> least not in loops. In addition, the parse cost of the eval may be
> substantially higher, so methods that use eval may not benefit much from
> optimizing the surrounding code anyway; throwing up our hands and doing the
> simple thing as above (allocate a frame, load/store live variables to
> memory) might be good enough.
In fact, our current deoptimization strategy for methods containing
"eval" works pretty well already: we just make the entire containing
method use a heap-based scope. It may not even be worth refining our
deopt mechanism any further given that it's so rare to see
perf-critical code calling "eval" or non-evil methods named "eval".
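The coarse rule described here amounts to a one-pass scan: if a method body contains a call to "eval" anywhere, compile the whole method with a heap-based scope so the eval'd code can see and mutate its locals. A minimal Java sketch, with a hypothetical simplified AST node type (real JRuby node classes are more involved):

```java
import java.util.List;

// Hypothetical, simplified AST node: `name` is the call name for call
// nodes, null otherwise.
class Node {
    final String name;
    final List<Node> children;
    Node(String name, List<Node> children) {
        this.name = name;
        this.children = children;
    }
}

class ScopeAnalyzer {
    // Returns true if the method body needs a heap-allocated scope,
    // i.e. it contains a call named "eval" anywhere inside it.
    static boolean needsHeapScope(Node node) {
        if ("eval".equals(node.name)) return true;
        for (Node child : node.children) {
            if (needsHeapScope(child)) return true;
        }
        return false;
    }
}
```

The cost of this conservatism is that a single eval deoptimizes the entire containing method, but as noted above, that method was likely dominated by parse cost anyway.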
> It is good to hear from everyone about common code patterns and what the
> common scenarios are, and what needs to be targeted. But, the problem often
> is that ensuring correctness for the 1% (or even 0.1%) uncommon case might
> effectively block good performance for the 99% common case. The specifics
> will depend on the specific case being considered.
There are two schools of thought here. One is that we could explicitly
define the optimization characteristics of the system and say "if you
do X after point Y, your changes won't be visible to compiled code."
In the 0.1% or 0.01% cases, this may be acceptable, and perhaps nobody
will ever be impacted by it. But no matter how small the likelihood,
we can't claim that our optimizations are 100% non-damaging to normal,
expected Ruby behavior. Whether we can bend the rules of what is
"normal" or "expected" is more a political debate than a technical
one.
The other school of thought is that we must be slavishly 100%
compatible all the time. I think our lack of an aliasable "eval"
proves that's not the case; there *are* things that people simply *do
not do*, and we do not need to always allow them to penalize
performance. And taking an even stronger position, we can always say
"this is how JRuby works; it's not compatible, but it's what we needed
to do to get performance for the 99% case," as we did with the
un-aliasable "eval". Generally people won't complain, and if they do
they won't actually be affected by it.
> Fixnums are probably a good example. In the absence of OSR or external
> guarantees that fixnum methods are not modified, you are forced either to
> not optimize fixnums to regular ints, or introduce guards before most calls
> to check for class mods. We could compile optimistically assuming that
> fixnum.+ is not modified ever, with the strategy that if fixnum.+ is indeed
> modified, we will back off and deoptimize. But this requires the ability
> to do an on-stack replacement (OSR) of currently executing code, and since
> you don't control the JVM, you won't be able to do one. Barring other tricks, this
> effectively kills optimistic compilation. I am still hoping some trick can
> be found that enables optimistic compilation without requiring external
> (programmer) guarantees, but nothing has turned up yet so far.
>
> On the other hand, you could introduce guards after all method calls to
> check that fixnum.+ is not modified. This is definitely an option, but is a
> lot of overhead (for numeric computations relative to most other languages)
> simply because of the possibility that someone somewhere has decided that
> overriding fixnum.+ is a good thing!
>
> So, this is one example where correctness requirements for the uncommon
> case get in the way of higher performance for the common case. eval is
> another example where the 1% case gets in the way, but Tom is right that
> parsing overhead is probably the higher cost there anyway. So, we should
> examine the different 99%-1% scenarios in greater detail to see what it
> takes to keep the 1% uncommon case from hurting performance for the 99%
> common case.
I definitely expect there to be many different scenarios that we'll
want to handle differently. The override deopt case for Fixnum is
extremely rare, and even rarer if you consider that such changes (if
ever made) are nearly always done long before anything gets compiled.
We'd generally know well in advance that a method on Fixnum has been
replaced, and can simply not do Fixnum optimizations.
We can also take an approach like --fast and just turn on fast Fixnum
math all the time, without any guards. If people really need to be
able to replace Fixnum#+, they can turn optimizations off. Again, only
affecting a minute percentage of users.
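A guard-based middle ground between "always check" and "--fast, no checks at all" can be sketched as a single invalidation flag: compiled code takes the raw-arithmetic fast path only while Fixnum#+ is known to be untouched, and a redefinition flips the flag so every call site falls back to generic dispatch. Names here are illustrative; a real runtime would use finer-grained invalidation than one global flag:

```java
// Sketch of guard-based Fixnum optimization with global invalidation.
// Illustrative only; not actual JRuby internals.
class FixnumGuard {
    // Cleared (once, permanently) if anyone redefines Fixnum#+.
    static volatile boolean fixnumPlusIntact = true;

    // Called by the runtime when Fixnum is reopened and + is redefined.
    static void onFixnumPlusRedefined() {
        fixnumPlusIntact = false;
    }

    static long add(long a, long b) {
        if (fixnumPlusIntact) {
            return a + b;                      // fast path: raw machine add
        }
        return genericDispatchPlus(a, b);      // slow path: full dispatch
    }

    // Stand-in for a real dynamic dispatch through the redefined method.
    static long genericDispatchPlus(long a, long b) {
        return a + b;  // a real runtime would look up and invoke Fixnum#+
    }
}
```

The guard is a single read of a flag that is almost always true, so a JIT can speculate it away almost entirely; the real cost is the engineering of the invalidation plumbing, not the check itself.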
> Note that I am using relatively loose language w.r.t. 'performance' --
> many uses of this word beg the question of 'relative to what'. At some
> point, this language also needs to be tightened up.
That's certainly true. If we can get decent performance relative to a
previous JRuby version, we're making progress. If we can do well
compared to the core implementations of 1.8 and 1.9, we're doing well.
And if we can do well compared to LLVM-based implementations like
MacRuby and Rubinius that have tagged pointers and specific math
optimizations, we're doing great. Even then we'd be quite a bit slower
than Java, but it wouldn't matter a whole lot because we'd be among
the fastest Ruby implementations.
There's also another key point Tom constantly reminds me of: the
majority of Ruby application performance is not lost due to Ruby code
execution speed, but due to the speed of the core classes. If only 10%
of system performance relates to Ruby code execution, and we double
it, we've only gained a measly 5%. But if we double the performance of
the remaining 90% (presumably core classes), we improve overall perf
by 45%. It's a much bigger job, of course, but it helps put things in
perspective. It's probably better for us to be moderately
underoptimized than to have dismally inefficient core classes, if we
had to choose.
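The arithmetic here is just Amdahl's law: speeding up a fraction f of total time by factor s leaves (1 - f) + f/s of the original time. A tiny helper makes the 5% vs. 45% comparison concrete:

```java
// Amdahl's law: remaining fraction of total time after speeding up a
// fraction `fraction` of the work by factor `speedup`.
class AmdahlSketch {
    static double timeAfterSpeedup(double fraction, double speedup) {
        return (1.0 - fraction) + fraction / speedup;
    }
}
```

Doubling the 10% that is Ruby code execution leaves 0.95 of the original time (a 5% win); doubling the 90% that is core classes leaves 0.55 (a 45% win).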
- Charlie