On Fri, Oct 25, 2013 at 12:19 PM, Ben Karel <[email protected]> wrote:

> On Fri, Oct 25, 2013 at 11:54 AM, Jonathan S. Shapiro <[email protected]>wrote:
>
>> Thanks for clarifying. I looked at the description, and it seems that
>> they have a fairly 1:1 encoding from IR instructions to bitcode
>> instructions, such that the majority of bitcode instructions translate 1:1
>> into target instructions on RISC machines. That would make it a DBT scheme
>> in my mind.
>>
>
> Bitcode is just a binary representation of LLVM IR.  A translator from
> LLVM to machine code still needs to do register allocation, which makes it
> not-DBT by your earlier classification, right?
>

I think that's right.

To be honest, I don't have a feature-oriented view of what constitutes DBT.
The thing that differentiates JIT from DBT in my mind is that a JIT seeks
to optimize the output code, whereas a DBT seeks to minimize translation
time.
Dawson Engler made a similar distinction in his work on DPF and his xcode
translation scheme. In fact, xcode was one of the things that informed our
approach in HDTrans.

If I remember right, the LLVM IR is essentially an SSA form, so register
coloring can be done in polynomial time. That's good, but you're still not
going to get a translation rate anywhere near 12-20 host instructions per
translated instruction. Their bitcode language also makes provision for
inlining (both mandatory and advisory), which drives that polynomial up
pretty fast. That, in turn, pushes you to do more optimization in the
dynamic translator to make up the cost of dynamic translation. There's
absolutely nothing wrong with doing that, but all of a sudden you have an
order of magnitude - or even two orders of magnitude - more translation
work invested that you have to amortize.
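
Just to put rough numbers on the amortization point - this is
back-of-the-envelope C of my own, not anything from the LLVM docs, and
the instruction counts are purely illustrative:

    #include <stdio.h>

    /* Back-of-the-envelope: if translating one guest instruction costs
       T host instructions, translation overhead drops below 10% of
       total run time only once that instruction has executed more than
       9*T times (T/(T+N) < 0.1 implies N > 9*T). */
    int main(void)
    {
        const double t_dbt = 16.0;    /* ~12-20 host insns/insn, HDTrans-like  */
        const double t_opt = 1600.0;  /* hypothetical: two orders of magnitude */

        printf("fast DBT break-even:   %6.0f executions\n", 9.0 * t_dbt);
        printf("optimizing break-even: %6.0f executions\n", 9.0 * t_opt);
        return 0;
    }

The exact threshold doesn't matter much; the point is that the break-even
execution count scales linearly with translation cost, and that is what
makes the shelf expensive.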

What I'm saying is that the complexity difference between translation
approaches isn't a continuous gradient; it's a shelf function. The effect
of that shelf is to increase the amount of code you have to translate,
optimize, and run in order to amortize the cost of translation. And the
more work you invest up front, the less credible the story becomes about
turning translation on and off on the fly as a way to do read barriers.

And that's why the DBT I'm talking about for garbage collection purposes
operates on native instructions, where registers have already been
allocated. On x86 we didn't need a scratch register, but on a RISC I might
even restrict the static compilation scheme to leave a scratch register
available for the DBT. So what I'm contemplating is a cooperative scheme
with *very* low run-time translation overhead.
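
To make the cooperative part concrete, here is a sketch - in C, printing
the code it would emit, with the register choice, symbol names, and
calling convention all hypothetical rather than anything from HDTrans -
of how a translator might splice a read barrier into a load whose
registers were already allocated statically:

    #include <stdio.h>

    /* Hypothetical barrier insertion during translation. The translated
       copy of a heap load gains a range test on the loaded pointer.
       r15 is the scratch register the static compiler agreed to leave
       free; fromspace_base, fromspace_size, and gc_forward are assumed
       runtime symbols. When collection is off, the translator just
       emits the load by itself. */
    static int label = 0;

    static void emit_load_with_barrier(const char *dst, const char *addr)
    {
        int l = label++;
        printf("    mov  %s, [%s]\n", dst, addr);   /* the original load */
        printf("    mov  r15, %s\n", dst);          /* copy to scratch   */
        printf("    sub  r15, [fromspace_base]\n");
        printf("    cmp  r15, [fromspace_size]\n"); /* in from-space?    */
        printf("    jae  .Lok%d\n", l);             /* no: nothing to do */
        printf("    call gc_forward\n");            /* yes: forward it   */
        printf(".Lok%d:\n", l);
    }

    int main(void)
    {
        emit_load_with_barrier("rax", "rbx");       /* mov rax, [rbx]    */
        return 0;
    }

Because the barrier only touches the reserved register, none of this
perturbs the allocation the static compiler already did, and that is what
keeps the per-instruction translation cost down.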

I'm not sure that makes things any clearer, though. Does it?


BTW, the issue I'm concerned about here is basically the same issue that is
driving the "patchpoint" IR idea in LLVM, except that they seem to want to
patch the code in place. That defeats binary code sharing, and it deprives
them of any payback from trace construction, which is why I don't lean that
way.

But having said that, my thinking is probably biased by the fact that I'm
coming from the HDTrans experience. It's very possible that everything
looks like a nail as a result.


shap

- When C++ is your hammer, everything looks like your thumb.