Re: Truffle and mlvm

Thomas E Enebo Mon, 06 Oct 2014 08:17:19 -0700

Benoit,

Always a pleasure working with you :)  If you have any questions or need
assistance, you can IM me or talk on #jruby on IRC.  Concurrency in
Ruby+Truffle is a pretty big missing item, so it will be great to see this
implemented.


-Tom


On Thu, Oct 2, 2014 at 7:26 AM, Benoit Daloze <erego...@gmail.com> wrote:

> Hello Charles and Thomas,
>
> On 31 August 2014 00:04, Thomas Wuerthinger <thomas.wuerthin...@oracle.com> wrote:
>
>> Thanks a lot, Charlie, for this very detailed feedback! In fact, this is
>> probably the most comprehensive feedback we’ve received for Truffle so far
>> :).
>>
>> There are some valid points, some points where I’d like to add a comment,
>> and some where Graal and Truffle may have been misunderstood. I’ll try to
>> address them in a similarly structured form:
>>
>> Regarding disadvantage 1 “AST is not enough”:
>> A. Yes, you need to provide specialisations for your operations for
>> better performance. This makes it harder to implement a Truffle AST
>> interpreter than a simple AST interpreter. It has however the advantage
>> that it gives you more predictable performance for the different usages of
>> your dynamic language operation. We are currently working on source code
>> visualisations for Truffle ASTs so that users can see what nodes are
>> specialised and to what types.
>> B. Yes, it is necessary to store your local variables in a Truffle frame
>> object. This object can however contain pointers to arbitrary extra data
>> necessary for your guest language.
>> C. We made a lot of improvements to Truffle, Graal, and also TruffleRuby
>> since January. Inlining works without any problems and independent of the
>> guest language, also in TruffleRuby.
>>
>> Regarding disadvantage 2 “Long startup and warmup times”:
>> The benefit of the system is absolutely *not* lost when the compiler
>> (Graal) and the Truffle guest language interpreter are AOT compiled. It
>> gives you decent startup and high peak performance. The interpreter is immediately
>> available in highly optimized machine code. Hot guest language methods are
>> dynamically compiled to machine code by the precompiled Graal.
>>
>> Regarding disadvantage 3 “Limited concurrency”:
>> There is no deeper reason why TruffleRuby is single threaded right now.
>> For sure none that has to do with the fundamentals of the Truffle approach.
>> We are planning to support 100% multi-threading also in TruffleRuby. One of
>> the explorations we are currently doing is to support guest language level
>> safepoints such that guest language developers themselves can more easily deal
>> with concurrency without compromising any peak performance.
>>
>
> I plan to work on concurrency in Truffle during my PhD in Linz.
> Making Truffle thread-safe is a priority and some work is already done for
> AST replacements.
> I am also interested in supporting different concurrency primitives such
> as threads and fibers for the guest languages.
>
> Benoit
>
>> Regarding disadvantage 4 “Limited availability”:
>> Yes, this is indeed a chicken and egg problem. Truffle is however not as
>> closely tied to Graal as you suggest here. I believe that it is fairly
>> straightforward to create a Truffle front-end for C2 (or any other compiler
>> supporting deoptimization). There are only 3k LOC in Graal that are
>> specific to Truffle. I think that they could be ported in a reasonable time
>> frame. The Truffle interpreters themselves run on any Java system even if
>> it has only very limited features - this is actually an advantage over a
>> pure bytecode generation approach. They can also be AOT compiled for
>> devices that do not support a full JVM and have strong footprint
>> requirements. Execution would of course be slower than in a full-fledged
>> VM, but it would at least run correctly. I furthermore think that
>> it is possible to do the Truffle partial evaluation via bytecode generation
>> for backwards compatibility.
>>
>> Regarding disadvantage 5 “Unclear benefits for real-world applications”:
>> This kind of argument can hardly be countered before a system is 100%
>> finished and shipped. The term “real world” is also somewhat loosely
>> defined. I would very much support the development of a JRuby benchmark
>> suite that tries to reflect “real world” as closely as possible.
>> There is absolutely no reason to believe that a Truffle-based Ruby
>> implementation would not have benefits for “real-world applications”. Or
>> that it would not be able to run a large application for a long time. It is
>> clear that the TruffleRuby prototype needs more completeness work both at
>> the language and the library level. We are very happy with the results we
>> got so far with Chris working for about a year. We are planning to increase
>> the number of people working on this, and would also be grateful for any
>> help we can get from the Ruby community.
>>
>> Regarding Graal:  Did you ever try to benchmark JRuby without Truffle
>> with the latest Graal binaries available at
>> http://lafo.ssw.uni-linz.ac.at/builds/? We look forward to
>> seeing the peak performance results on a couple of workloads. We are not
>> speculating about Graal becoming part of a particular OpenJDK release (as
>> experimental or regular option). This is the sovereign decision of the
>> OpenJDK community. All we can do is to demonstrate and inform about Graal’s
>> performance and stability.
>>
>> We recognise that there is a long road ahead. But in particular in this
>> context, I would like to emphasize that we are looking for more people to
>> support this effort for a new language implementation platform. I strongly
>> believe that Truffle is the best currently available vehicle to make Ruby
>> competitive in terms of performance with node.js. We are happy to try to
>> *prove* you wrong - even happier about support of any kind along the road
>> ;). I am also looking forward to continuing this discussion at JavaOne (as
>> part of the TruffleRuby session or elsewhere).
>>
>> Regards, thomas
>>
>> On 30 Aug 2014, at 21:21, Charles Oliver Nutter <head...@headius.com>
>> wrote:
>>
>> > Removing all context, so it's clear these are just my opinions and
>> > thoughts...
>> >
>> > As most of you know, we've opened up our codebase and incorporated the
>> > graciously-donated RubyTruffle directly into JRuby. It's available on
>> > JRuby master and we are planning to ship Truffle support with JRuby
>> > 9000, our next major version (due out in the next couple months).
>> >
>> > At the same time, we have been developing our own next-gen IR-based
>> > compiler, which will run unmodified on any JVM (with or without
>> > invokedynamic, though I still have to implement the "without" side).
>> > Why are we doing this when Truffle shows such promise?
>> >
>> > I'll try to enumerate the benefits and problems of Truffle here.
>> >
>> > * Benefits of using Truffle
>> >
>> > 1. Simpler implementation.
>> >
>> > From day 1, the most obvious benefit of Truffle is that you just have
>> > to write an AST interpreter. Anyone who has implemented a programming
>> > language can do this easily. This specific benefit doesn't help us
>> > implement JRuby, since we already have an AST interpreter, but it did
>> > make Chris Seaton's job easier building RubyTruffle initially. This
>> > also means a Truffle-based language is more approachable than one with
>> > a complicated compiler pipeline of its own.
>> >
>> > 2. Better communication with the JIT.
>> >
>> > Truffle, via Graal, has potential to pass much more information on to
>> > the JIT. Things like type shape, escaped references, frame access,
>> > type specialization, and so on can be communicated directly, rather
>> > than hoping and praying they'll be inferred by the shape of bytecodes.
>> > This is probably the largest benefit; much of my time optimizing JRuby
>> > has been spent trying to "trick" C2 into doing the right thing, since
>> > I don't have a direct way to communicate intent.
>> >
>> > The peak performance numbers for Truffle-based languages have been
>> > extremely impressive. If it's possible to get those numbers reasonably
>> > quickly and with predictable steady-state behavior in large,
>> > heterogeneous codebases, this is definitely the quickest path (on any
>> > runtime) to a high-performance language implementation.
>> >
>> > 3. OSS and pure Java
>> >
>> > Truffle and Graal are just OpenJDK projects under OpenJDK licenses,
>> > and anyone can build, hack, or distribute them. In addition, both
>> > Truffle and Graal are 100% Java, so for the first time a plain old
>> > Java developer can see (and manipulate) exactly how the JIT works
>> > without getting lost in a sea of plus plus.
>> >
>> > * Problems with Truffle
>> >
>> > I want to emphasize that regardless of its warts, we love Truffle and
>> > Graal and we see great potential here. But we need a dose of reality
>> > once in a while, too.
>> >
>> > 1. AST is not enough.
>> >
>> > In order to make that AST fly, you can't just implement a dumb generic
>> > interpreter. You need to know about (and generously annotate your AST
>> > for) many advanced compiler optimization techniques:
>> >
>> > A. Type specialization plus guarded fallbacks: Truffle will NOT
>> > specialize your code for you. You must provide every specialized path
>> > in your AST nodes as well as annotating "slow path", "transfer to
>> > interpreter", etc.
>> >
>> > B. Frame access and reification: In order to have cross-call access to
>> > frames or to squash frames created for multiple inlined calls, you
>> > must use Truffle's representation of a frame. This means loads/stores
>> > within your AST must be done against a Truffle object, not against an
>> > arbitrary object of your own creation.
>> >
>> > C. Method invocation and inlining: Up until fairly recently, if you
>> > wanted to inline methods you had to essentially build your own call
>> > site logic, profiling, deopt paths within your Truffle AST. When I did
>> > a little hacking on RubyTruffle around OSS time (December/January) it
>> > did *no* inlining of Ruby-to-Ruby calls. I hacked in inlining using
>> > existing classes and managed to get it to work, but I was doing all
>> > the plumbing myself. I know this has improved in the Truffle codebase
>> > since then, but I have my concerns about production readiness when the
>> > inlining call site parts of Truffle were just recently added and are
>> > still in flux.
>> >
>> > And there are plenty of other cases. Building a basic language for
>> > Truffle is pretty easy (I did a micro-language in about two hours at
>> > JVMLS last year), but building a high-performance language for Truffle
>> > still takes a fair investment of effort and working knowledge of
>> > dynamic compiler optimizations.
>> >
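
To make point A above a bit more concrete, here is a minimal sketch of what
"type specialization plus guarded fallbacks" can look like when written by
hand. It is plain Java and deliberately does not use the Truffle API; every
class and method name (AddSite, IntAdd, rewriteAndExecute, ...) is
hypothetical and chosen only for illustration. The node starts
uninitialized, rewrites itself to an integer-only version on first use, and
falls back to a generic version when its type guard fails.

```java
// Plain-Java sketch; every name here is hypothetical and none of it is Truffle API.
abstract class AddNode {
    abstract Object execute(Object left, Object right);
}

// Call site that owns the active node and lets a node replace itself.
final class AddSite {
    AddNode current = new UninitializedAdd(this);
    int rewrites = 0;

    Object execute(Object left, Object right) {
        return current.execute(left, right);
    }

    // Swap in a new node, then re-run the operation on it.
    Object rewriteAndExecute(AddNode replacement, Object left, Object right) {
        current = replacement;
        rewrites++;
        return replacement.execute(left, right);
    }
}

// First execution: pick a specialization from the operand types actually seen.
final class UninitializedAdd extends AddNode {
    private final AddSite site;

    UninitializedAdd(AddSite site) {
        this.site = site;
    }

    @Override
    Object execute(Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return site.rewriteAndExecute(new IntAdd(site), left, right);
        }
        return site.rewriteAndExecute(new GenericAdd(), left, right);
    }
}

// Fast path: valid only while both operands are Integers.
final class IntAdd extends AddNode {
    private final AddSite site;

    IntAdd(AddSite site) {
        this.site = site;
    }

    @Override
    Object execute(Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return ((Integer) left) + ((Integer) right);
        }
        // Guard failed: give up the specialization and take the slow path.
        return site.rewriteAndExecute(new GenericAdd(), left, right);
    }
}

// Slow path: handles every operand combination, never rewrites again.
final class GenericAdd extends AddNode {
    @Override
    Object execute(Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return ((Integer) left) + ((Integer) right);
        }
        if (left instanceof Number && right instanceof Number) {
            return ((Number) left).doubleValue() + ((Number) right).doubleValue();
        }
        return String.valueOf(left) + right;
    }
}

class SpecializationDemo {
    public static void main(String[] args) {
        AddSite site = new AddSite();
        System.out.println(site.execute(1, 2));     // specializes to IntAdd
        System.out.println(site.execute(1.5, 2));   // guard fails, goes generic
        System.out.println("rewrites: " + site.rewrites);
    }
}
```

Even in this toy form, each operation needs an uninitialized, a specialized,
and a generic variant plus the rewrite plumbing between them, which is the
extra effort being described above.
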
>> > 2. Long startup and warmup times.
>> >
>> > As Thomas pointed out in the other thread, because Truffle and Graal
>> > are normally run as plain Java libraries, they can actually aggravate
>> > startup time issues. Now, not only would all of JRuby have to warm up,
>> > but the eventual native code JIT has to warm up too. This is not
>> > surprising, really. It is possible to mitigate this by doing some form
>> > of AOT against Graal, but for every case I have seen the Truffle/Graal
>> > approach makes startup time much, much worse compared to just running
>> > atop JVM.
>> >
>> > Warmup time is also worsened significantly.
>> >
>> > The AST you create for Truffle must be heavily mutated while running
>> > in order to produce a specialized version of that AST. This must
>> > happen before the AST is eventually fed into Graal, which means you
>> > have a self-modifying interpreter spinning AST objects like mad while
>> > executing the early phases of your application. Compare to a dumb
>> > interpreter as in JRuby's old AST, where interpreting the AST produces
>> > no additional objects other than those necessary for execution of the
>> > code.
>> >
>> > The Truffle approach itself adds overhead too. Until optimized, the
>> > fully-reified frame objects, specialization markup (which triggers AST
>> > rewriting), deoptimization guards, and so on are all done manually
>> > against heap-level data structures. This is in addition to the
>> > JVM-level overhead of executing an AST (native frame-per-node, boxing
>> > and type-widening, poor inlining profile).
>> >
>> > Some amount of AOT *might* be applicable here, but the benefit of
>> > Truffle and Graal is lost in the AOT case if we're not getting
>> > real-world profile information. The Substrate VM has been brought up to
>> > aid startup and warmup too...but that direction produces a
>> > closed-world executable based on optimizing all code up front...not
>> > exactly what we're looking for in a general-purpose language runtime.
>> >
>> > 3. Limited concurrency
>> >
>> > The RubyTruffle runtime currently has to execute code under the
>> > watchful eye of a global lock. Yes, you read that right...RubyTruffle
>> > is single-threaded right now.
>> >
>> > I would like to know if there's a deeper reason for this, but the
>> > obvious shallow reason is that you can't have multiple threads
>> > executing at the same time if they're making thread-unsafe mutations
>> > to the executing code. This is similar to the major stumbling block
>> > for e.g. Pypy, which rewrites currently-executing assembly
>> > instructions at deopt/reopt safe points.
>> >
>> > I believe once the code has transitioned to native, you can execute
>> > that safely across threads...but this is opaque to your Truffle-based
>> > language, and it's unclear how you'd manage re-acquiring some sort of
>> > lock when transferring back to the interpreter.
>> >
>> > The fact that concurrency has so far been hand-waved (or so it seems
>> > to me from the outside) scares the living hell out of me, especially
>> > when there's talk about rolling this stuff into Java 9.
>> >
>> > Obviously some of this could be mitigated with an immutable AST
>> > structure or other thread-friendly tree-transformation algorithm, but
>> > making the Truffle AST thread-safe may also make it even more
>> > object-heavy during interpretation, aggravating startup time further.
>> >
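
One possible shape for a "thread-friendly tree transformation" is to publish
node replacements with a compare-and-set, so that racing threads cannot
clobber each other's rewrites. The sketch below is plain Java under that
assumption; it is not how Truffle implements AST replacement, and the names
(Node, Slot, raceReplacements) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical node type for this sketch; not a Truffle interface.
interface Node {
    int execute(int arg);
}

// A child slot whose contents can be replaced atomically.
final class Slot {
    private final AtomicReference<Node> child;

    Slot(Node initial) {
        child = new AtomicReference<>(initial);
    }

    int execute(int arg) {
        // Readers always see a fully constructed node, old or new.
        return child.get().execute(arg);
    }

    // Succeeds only if the slot still holds the node the caller observed.
    boolean replace(Node expected, Node replacement) {
        return child.compareAndSet(expected, replacement);
    }
}

class ReplaceDemo {
    // Starts several threads that all try to specialize the same slot
    // and returns how many of them actually installed their replacement.
    static int raceReplacements(int threads) {
        Node generic = x -> x + 1;
        Slot slot = new Slot(generic);
        AtomicInteger wins = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Node specialized = x -> x + 1;
                if (slot.replace(generic, specialized)) {
                    wins.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return wins.get();
    }

    public static void main(String[] args) {
        System.out.println("successful replacements: " + raceReplacements(8));
    }
}
```

Only one thread can win the compare-and-set; the others see that the slot
has already changed and can simply re-read it. The cost is an extra level of
indirection on every child access in the interpreter, which is the kind of
added overhead the paragraph above is worried about.
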
>> > 4. Limited availability
>> >
>> > This is the chicken-and-egg issue. Truffle is just a library, so we
>> > can ignore that for the moment (given any JVM, you can run a Truffle
>> > language).
>> >
>> > Graal is required for Truffle to perform well at all. The Truffle
>> > interpreter is without a doubt the slowest interpreter we've ever had
>> > for JRuby, and that's saying something (there could be startup/warmup
>> > effects in play here too). In order for us to go 100% Truffle, we'd
>> > need a Graal VM. That limits us to either pre-release or hand-made
>> > builds of Graal/OpenJDK. Even if Graal somehow did get into Java 9,
>> > we'd still have legions of users on 8, 7, ... even 6 in some cases,
>> > though we're probably leaving them behind with JRuby 9000. Ignoring
>> > other platforms (non-OpenJDK, Android) and assuming Graal in Java 9,
>> > I'd conservatively estimate JRuby could still not go 100% Truffle
>> > until 2017 or later.
>> >
>> > And it gets worse. Graal will probably never exist on other JVMs.
>> > Graal will probably never exist in an Android VM. Graal may not even
>> > be available in other non-Oracle OpenJDK derivatives for a very long
>> > time. We have users on dozens of different platform/JVM combinations,
>> > so there's really no practical way for us to abandon our JVM bytecode
>> > runtimes in the near future.
>> >
>> > Now of course if Graal became essential to users, it would be
>> > available in more places. We recognize the potential of Truffle and
>> > Graal, which is why we've been thrilled to work with Oracle on a
>> > RubyTruffle that's part of JRuby. We also recognize that the
>> > Truffle/Graal approach has some very compelling features for our
>> > users, and that our users may often be comfortable running custom
>> > JVMs. We're allowing all flowers to bloom and our users will pick the
>> > ones that work for them.
>> >
>> > 5. Unclear benefits for real-world applications
>> >
>> > There have been many published microbenchmarks for Truffle-based
>> > languages, but very few benchmarks of real-world applications
>> > performing significantly better than custom-made VMs (JS versus V8).
>> > There have been practically no studies of a Truffle-based language
>> > running a large application for a long period of time...and by long I
>> > mean server-scale.
>> >
>> > Chris Seaton has pushed this forward recently for Ruby, getting
>> > general-purpose, numeric-heavy libraries to run and optimize very well
>> > (a png library and a psd library). Going deeper requires having more
>> > of the language's standard libraries to be available, and I believe
>> > this is where Chris has spent much of his time (RubyTruffle currently
>> > requires mostly-custom versions of JRuby's core classes...versions
>> > that Truffle can recognize, specialize, and escape-analyze away).
>> >
>> > * Conclusion
>> >
>> > I again want to emphasize that we think Truffle and Graal are really
>> > awesome technology. I spent years with my nose smooshed against the
>> > glass, watching the Pypy guys add optimizations I wanted and make good
>> > on their promise of "just implement an interpreter...we'll do the
>> > rest". Finally we have what I wanted: a Pypy for JVM (in Truffle) and
>> > an LLVM for JVM (in Graal). These are exciting times indeed.
>> >
>> > But reality steps in. There's a long road ahead.
>> >
>> > I think we need to separate the questions about Truffle from questions
>> > about Graal. Truffle is ultimately just a library that uses Graal.
>> >
>> > Graal is promising JIT technology. Graal is simpler than C2 and may be
>> > able to match or beat its performance. Graal provides a better way to
>> > communicate intent to the JIT. These facts are not in question.
>> >
>> > However, Graal is not (other than when used as the JVM's JIT) a JVM.
>> > Targeting Graal directly acts against the promise of a standard,
>> > platform-and-VM-agnostic bytecode -- and that's the promise that
>> > brought most of us here. Graal is not yet ready to replace C2, which
>> > would mean adding to the size and complexity of Java 9. And Graal is
>> > almost completely untested in large production settings.
>> >
>> > I personally would love to see Graal get into a Java release soon as
>> > an experimental feature, but Java 9 seems ambitious by any standard.
>> > It *might* be possible/reasonable to include Graal as experimental in
>> > 9. Java 10 is certainly feasible for experimental, and may be feasible
>> > for product. But even if Graal got into mainstream OpenJDK and Java,
>> > there's a very long adoption tail ahead.
>> >
>> > I'd like to hear more from folks on the Graal and Truffle teams. Prove
>> > me wrong :-)
>> >
>> > - Charlie
>> > _______________________________________________
>> > mlvm-dev mailing list
>> > mlvm-dev@openjdk.java.net
>> > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>
>>
>


-- 
blog: http://blog.enebo.com       twitter: tom_enebo
mail: tom.en...@gmail.com
