I made decent progress on using KVM as a CPU model. Execution got fairly
far along, but the problem I hit when I stopped working on it was that the
timer Linux wanted to use was provided by KVM for performance reasons, and
it ran at real (wall-clock) speed, while gem5 (m5 at the time) has its own
carefully controlled and usually much slower notion of time. There may
have been other plumbing issues too, like making sure interrupts were being
piped to the right places, but basically at a certain point during boot
things went crazy, and Linux got upset and quit working.
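
To make the timer mismatch concrete, here is a rough sketch in Python. It is
not gem5 or KVM code: the function names, the 10 ms period, and the 1000x
slowdown are all made up for illustration. It only counts how many periodic
timer interrupts a guest would see over a stretch of simulated time, depending
on which clock drives the timer.

```python
def timer_ticks(elapsed_ns: int, period_ns: int) -> int:
    """Number of periodic timer interrupts that fire within elapsed_ns."""
    return elapsed_ns // period_ns


def ticks_seen_by_guest(sim_ns: int, period_ns: int, slowdown: int,
                        timer_on_host_clock: bool) -> int:
    """Timer interrupts delivered while the guest runs for sim_ns.

    slowdown is a hypothetical ratio: host nanoseconds that pass for every
    simulated nanosecond. If the timer follows the host clock, it sees
    sim_ns * slowdown of elapsed time; otherwise it sees sim_ns.
    """
    elapsed_ns = sim_ns * slowdown if timer_on_host_clock else sim_ns
    return timer_ticks(elapsed_ns, period_ns)


PERIOD_NS = 10_000_000   # assumed 10 ms timer period
SIM_NS = 10_000_000      # guest executes 10 ms worth of simulated work
SLOWDOWN = 1000          # assumed: simulation runs 1000x slower than real time

in_sync = ticks_seen_by_guest(SIM_NS, PERIOD_NS, SLOWDOWN, False)
mismatched = ticks_seen_by_guest(SIM_NS, PERIOD_NS, SLOWDOWN, True)
print(in_sync, mismatched)
```

Under these assumed numbers the guest gets one interrupt when the timer
follows simulated time and a thousand when it follows the host clock, for the
same amount of guest work. Whether that is exactly what upset Linux in the
KVM experiment is a guess; the sketch only shows the scale of the mismatch.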
As for how to hook binary translation into gem5, it would be nice, but
there are a few problems. First, as Steve said, internal state isn't
necessarily that portable between the other environments and gem5. I'm less
optimistic than he is about the CPU state, but we agree that devices
are a major issue. There have been attempts at doing this sort of thing,
but I don't think any of them worked out well enough to become a permanent
part of gem5.
Another option would be to actually implement a binary translating CPU
model in gem5 that worked with everything else in the simulator by design,
rather than being bolted on after the fact. It probably wouldn't be as good
as other similar implementations, just because a lot of people have spent a
lot of time on those. A gem5 version could probably be decent, though, and
would be better than not having anything. As Steve said, it would be a
*lot* of work. I'd want our ISA definitions to somehow fit in with both the
interpreted and translated systems so that they'd be consistent and we
wouldn't have two separate implementations to maintain, so those would have
to be reworked, both the underlying mechanism and the descriptions
themselves. That alone would be a fairly daunting task. Then you'd also have
to write the translation engine itself, which would basically be like
rewriting QEMU. Actually hooking it into the rest of the simulator would be
more straightforward at that point.
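
For readers unfamiliar with the translate-and-cache idea from the original
question, here is a minimal sketch in Python. Everything in it is
hypothetical: the three-instruction toy ISA (addi, jnz, halt), the
TranslatingCPU class, and the block format have nothing to do with gem5's ISA
descriptions or QEMU's internals. The "translation" is just a Python closure
rather than generated host code; the point is only the structure, where each
basic block is translated once, keyed by its starting PC, and reused after
that.

```python
from typing import Callable, Dict, List, Optional, Tuple

Insn = Tuple                 # ("addi", reg, imm), ("jnz", reg, target), ("halt",)
Regs = Dict[str, int]
Block = Callable[[Regs], Optional[int]]   # runs one block, returns next PC

BLOCK_ENDERS = {"jnz", "halt"}


def translate_block(program: List[Insn], pc: int) -> Block:
    """Turn the basic block starting at pc into a single callable."""
    body = []
    while program[pc][0] not in BLOCK_ENDERS:
        body.append(program[pc])
        pc += 1
    terminator, fallthrough = program[pc], pc + 1

    def run(regs: Regs) -> Optional[int]:
        for _, reg, imm in body:          # toy ISA: body is only "addi"
            regs[reg] += imm
        if terminator[0] == "halt":
            return None                   # no next PC: stop execution
        _, reg, target = terminator       # "jnz": branch if reg is nonzero
        return target if regs[reg] != 0 else fallthrough

    return run


class TranslatingCPU:
    """Hypothetical CPU model that caches translated blocks by start PC."""

    def __init__(self, program: List[Insn]) -> None:
        self.program = program
        self.cache: Dict[int, Block] = {}
        self.translations = 0

    def run(self, regs: Regs) -> Regs:
        pc: Optional[int] = 0
        while pc is not None:
            block = self.cache.get(pc)
            if block is None:             # miss: translate once, then reuse
                block = translate_block(self.program, pc)
                self.cache[pc] = block
                self.translations += 1
            pc = block(regs)
        return regs


program = [("addi", "r0", 2), ("addi", "r1", -1), ("jnz", "r1", 0), ("halt",)]
cpu = TranslatingCPU(program)
final_regs = cpu.run({"r0": 0, "r1": 5})
print(final_regs, cpu.translations)
```

The loop body runs five times but is translated only once, so the whole
program costs two translations (the loop block and the halt block). A real
engine would also have to deal with everything this skips, such as
self-modifying code, interrupts arriving mid-block, and memory accesses that
go through the rest of the simulator.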
In general, I agree with Steve. This sort of thing could work out, but it
would be really difficult to pull off and especially to do it well enough
for it to be considered a real implementation and not just a sort of
working proof of concept.
If you decide to do it anyway, my recommendation would be to either try to
hook KVM in as a CPU model or to implement binary translation from scratch
as a CPU model. KVM would probably be much more tractable in scope. No
matter what you do, be sure to discuss it with the dev list so we can help
you make the right decisions earlier rather than later so you don't have to
throw away a bunch of work. Also, you should avoid needing to change any
existing pieces of the simulator beyond what's absolutely necessary (ISA
descriptions are ok, for instance). That will keep your design cleaner, and
also avoid cluttering things up if/while your new work is in progress.
Gabe
Quoting Steve Reinhardt <[email protected]>:
Thanks for your interest in improving gem5!
The idea of doing binary translation to improve performance (particularly
for functional fast-forwarding) has come up before, but we haven't crossed
that bridge for several reasons:
1. Most of all, it's really really hard, and there are always plenty of
other more pressing things to work on.
2. Our current ISA descriptions weren't set up with this in mind, so it
would probably require reworking the ISA descriptions in addition to
building the framework.
3. Other groups (like QEMU and AMD's SimNow) have already built binary
translation tools that are way better than anything we would do.
4. For x86, at least, the idea of using hardware virtualization provides an
alternative that could have even higher performance than binary
translation.
What we've generally been thinking of as a more desirable and achievable
alternative would be to interoperate with another environment like QEMU,
SimNow, or KVM so that you could run at high speed in one of these other
tools, extract the system state, load it into gem5, and then run a detailed
simulation from there. Gabe Black did a little exploration of KVM quite a
while ago, but I don't think he got that far (correct me if I'm wrong,
Gabe). I also did a little internal playing around with SimNow but nothing
I can release. Other than that I don't know of anyone who's worked on this
yet.
Since the issues are pretty much the same, I'll use the term EE to refer to
a high-speed emulated environment, whether it's QEMU, SimNow, KVM, or
something else.
In theory, it's pretty straightforward; architectural CPU and memory state
is pretty well defined, and most of these systems have checkpoint/snapshot
capability, so it's simply a matter of running in one of these EEs, saving
a checkpoint, and loading it up into gem5. The big challenge really
revolves around devices: the set of devices that gem5 supports doesn't
necessarily intersect with those that these EEs support, and the internal
state representation is guaranteed to be different.
I think the best solution to the device problem is to find a way to use the
*same* device models in both the EE and in gem5, either by grafting the
EE's device models into gem5 or the other way around. For KVM, you'd have
to use gem5's models, since KVM by itself has no device models. For other
EEs, there are potential benefits to finding a way to port their device
models into gem5, since I expect they have more models (and more complete
models) than we do (certainly for SimNow I know that's true).
However, a big potential downside of incorporating other device models
is licensing. I know QEMU is GPL, which is problematic for us (since we
use a BSD-based license, and that's very important to us given the number
of companies involved with gem5). Anything that would contaminate gem5
with GPL is unacceptable. I haven't looked into QEMU enough to know if
this is something that can be worked around or not.
Also, while SimNow has a lot of appeal for those of us at AMD, I can see
where people would prefer an open-source and multi-ISA solution. SimNow is
probably more feasible than you might think, though, since there is a free
binary version available
(http://developer.amd.com/tools/simnow/pages/default.aspx), and we have
contacts in the SimNow group to explore opening up additional internal
interfaces etc. if that proves necessary.
I think KVM might be the most appealing avenue; it does tie us even more to
Linux than we are already, but that's the only major downside I see. It
also doesn't support all our ISAs, but Wikipedia says it does support
PowerPC in addition to x86, and there is an ARM port in the works
(http://systems.cs.columbia.edu/projects/kvm-arm/). I expect that x86+ARM
covers the vast and growing majority of our user base.
Just to be complete, I'll mention that I'm sure there are opportunities to
improve the performance of the existing gem5 ISA simulation/emulation that
are simpler and more feasible than doing binary translation in gem5, but I
expect those opportunities are more like tens of percent speedup rather
than the order(s?) of magnitude or so you'd probably get out of going to
something like KVM.
I'd really be glad to see something along these lines happen, and am happy
to help to the extent I can. I'm also interested if some of the other
developers have a different opinion or further insights.
Steve
On Sun, Mar 25, 2012 at 7:39 PM, Pablo Ortiz <[email protected]> wrote:
Hello dev group,
My group is looking at the possibility of improving the performance of
gem5 for the purpose of simulating an Android environment. In QEMU, there
is a step performed during binary translation in which basic code blocks
are translated and cached, so that executing common, previously translated
blocks avoids the overhead of translating them again. Would such an
optimization be reasonable, doable, or even sensible in the context of
gem5? I would love to hear the thoughts of the mailing list. I would like
to thank, in advance, any who wish to respond to this email.
Cheers,
El
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev