Roland Mainz wrote:
> While thinking about how the QEmu performance in the "interpreter" mode
> (e.g. emulating AMD64 on SPARC) could be improved I remembered that some
> platforms have CPU-specific versions of libc&co. to improve the
> performance...
> ... the question would be: Is it useful to add another libc variant to
> Solaris which calls into the emulator code to "accelerate" functions like
> |memcpy()|, |bzero()| etc. ... ?
> IMO it could short-cut a 1MB copy (1048576 bytes), which may otherwise
> result in at least 131072 emulated instructions (assuming 8byte/64bit
> transfers per instruction, not counting any loop/conditional/etc.
> instructions).
> If each emulation takes ~40 native instructions in the host system we may
> gain a factor of 40 in such memory operations (OkOk, this is just a very
> rough estimate) ... :-)
>
> I guess that other emulators like Bochs or virtualisation software like
> VMware or Xen may be able to benefit from such an API, too (the
> performance improvement would be much smaller than a factor of 40 but it
> would still save some CPU time) ...
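
To make the estimate quoted above a bit more concrete, here is a minimal C sketch. All names in it (|guest_memcpy()|, |accel_memcpy_hook|, |emulated_transfers()|, |host_insns_estimate()|) are hypothetical and only for illustration; none of them is an existing QEmu, Solaris or libc interface. The hook is just a function pointer which an emulator-aware libc variant might set, and the code falls back to the normal |memcpy()| when no hook is present.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook: an emulator-aware libc variant would point this at
 * a routine which hands the whole copy to the host in one step.
 * This is an assumption for illustration, not a real QEmu/Solaris API. */
typedef void *(*accel_memcpy_fn)(void *dst, const void *src, size_t n);

static accel_memcpy_fn accel_memcpy_hook = NULL;

/* Copy |n| bytes: use the (hypothetical) accelerated path if one is
 * registered, otherwise fall back to the ordinary memcpy(). */
void *guest_memcpy(void *dst, const void *src, size_t n)
{
    if (accel_memcpy_hook != NULL)
        return accel_memcpy_hook(dst, src, n);
    return memcpy(dst, src, n);
}

/* Number of emulated transfer instructions needed to move |bytes| bytes
 * when each instruction moves |bytes_per_insn| bytes (rounded up).
 * Loop and branch instructions are not counted. */
size_t emulated_transfers(size_t bytes, size_t bytes_per_insn)
{
    return (bytes + bytes_per_insn - 1) / bytes_per_insn;
}

/* Rough host cost: emulated instructions times the assumed number of
 * host instructions needed to emulate each one. */
size_t host_insns_estimate(size_t emulated_insns, size_t host_per_emulated)
{
    return emulated_insns * host_per_emulated;
}
```

With the numbers from the quoted text, |emulated_transfers(1048576, 8)| gives 131072 and |host_insns_estimate(131072, 40)| gives 5242880, i.e. roughly five million host instructions for a copy which a native |memcpy()| on the host could do directly. Like the original figure, this is only a rough estimate.
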
One small clarification about VMware and Xen:
VMware and Xen run native code on the native CPU but AFAIK all things
related to the MMU call back into the virtualisation layer - which is
quite expensive. A "block copy engine" (and a "zero block engine") in the
virtualisation layer would be one call vs. >= 256 calls when you move a
1MB block with 4K pages - and therefore it is IMO a good idea to add a
general API which can be used by VMware&&Xen&&QEmu&&Bochs, too...

----

Bye,
Roland

P.S.: Can anyone forward this to the Xen people at Sun ?

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)