Re: [Qemu-devel] Update on TCG Multithreading

2014-12-02 Thread Dr. David Alan Gilbert
* Mark Burton (mark.bur...@greensocs.com) wrote:
 
 All - first a huge thanks for those who have contributed, and those who have 
 expressed an interest in helping out.
 
 One issue I’d like to see more opinions on is the question of a cache per 
 core, or a shared cache.
 I have heard anecdotal evidence that a shared cache gives a major performance 
 benefit….
 Does anybody have anything more concrete?
 (of course we will get numbers in the end if we implement the hybrid scheme 
 as suggested in the wiki - but I’d still appreciate any feedback).
 
 Our next plan is to start putting an implementation plan together. Probably 
 quite sketchy at this point, and we hope to start coding shortly.

I'd expect a shared one to be able to take advantage
of code that's translated by one core and then used on
another.
On the other hand, with one per core you can perform updates
on the caches with a lot less locking; however, you've still
got to be able to do invalidates across all the caches if any
core does a write, and that could also get tricky.
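
To make the per-core trade-off concrete, here is a minimal sketch. Every name in it (CoreCache, cache_insert, cache_lookup, invalidate_all) is hypothetical and none of it is taken from QEMU. Inserts and lookups touch only the owning core's table, but an invalidate has to visit every core's table, and in a real multithreaded build that walk would need some form of synchronization that this single-threaded sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes and names; not QEMU's actual data structures. */
#define NUM_CORES 4
#define SLOTS 64

typedef struct CoreCache {
    uint64_t pcs[SLOTS];   /* guest PC stored in each slot */
    void *code[SLOTS];     /* translated code for that PC, NULL if empty */
} CoreCache;

static CoreCache caches[NUM_CORES];

/* Touches only the owning core's table, so no cross-core locking is needed. */
static void cache_insert(int core, uint64_t pc, void *code)
{
    size_t i = pc % SLOTS;
    caches[core].pcs[i] = pc;
    caches[core].code[i] = code;
}

/* Returns the translated code for pc in this core's table, or NULL on a miss. */
static void *cache_lookup(int core, uint64_t pc)
{
    size_t i = pc % SLOTS;
    if (caches[core].code[i] != NULL && caches[core].pcs[i] == pc) {
        return caches[core].code[i];
    }
    return NULL;
}

/*
 * A guest write to translated code must drop the entry from every core's
 * table. This loop is the part that would need cross-thread coordination
 * in a real implementation; here it is plain single-threaded code.
 */
static void invalidate_all(uint64_t pc)
{
    size_t i = pc % SLOTS;
    for (int core = 0; core < NUM_CORES; core++) {
        if (caches[core].code[i] != NULL && caches[core].pcs[i] == pc) {
            caches[core].code[i] = NULL;
        }
    }
}
```

Note that code inserted by core 0 is invisible to core 1 in this layout, which is the reuse a shared cache would give for free.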

Dave

 
 
 Cheers
 
 Mark.
 
 
 
 
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] Update on TCG Multithreading

2014-12-02 Thread Kirill Batuzov
On Mon, 1 Dec 2014, Mark Burton wrote:
 
 One issue I’d like to see more opinions on is the question of a cache per 
 core, or a shared cache.
 I have heard anecdotal evidence that a shared cache gives a major performance 
 benefit….
 Does anybody have anything more concrete?

There is a theoretical and experimental comparison of these approaches in
the PQEMU article (you've cited it on the wiki page). The authors just use
different names: they call cache-per-core the Separate Code Cache (SCC) and
the shared cache the Unified Code Cache (UCC).

-- 
Kirill

Re: [Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Lluís Vilanova
Mark Burton writes:

 All - first a huge thanks for those who have contributed, and those who have
 expressed an interest in helping out.

 One issue I’d like to see more opinions on is the question of a cache per core,
 or a shared cache.
 I have heard anecdotal evidence that a shared cache gives a major performance
 benefit….
 Does anybody have anything more concrete?
 (of course we will get numbers in the end if we implement the hybrid scheme as
 suggested in the wiki - but I’d still appreciate any feedback).

I think it makes sense to have a per-core pointer to a QOM TCGCacheClass. That
can then have its own methods for working with updates, making it much simpler
to work with different implementations, like completely avoiding locks (per-cpu
cache) or a hybrid approach like the one described in the wiki.
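
As a rough illustration of the idea, the sketch below uses a plain C table of function pointers in place of a real QOM class. All names (TCGCache, TCGCacheOps, VCPU, percpu_lookup, percpu_insert) are hypothetical and do not correspond to existing QEMU code. Each vCPU holds a pointer to a cache object, and the cache carries its own lookup and insert methods, so a lock-free per-CPU implementation and a locked shared one could sit behind the same interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not QEMU's actual API. */
#define CACHE_SLOTS 64

typedef struct TCGCache TCGCache;

typedef struct TCGCacheOps {
    /* Look up translated code for a guest PC; NULL on a miss. */
    void *(*lookup)(TCGCache *cache, uint64_t pc);
    /* Record translated code for a guest PC. */
    void (*insert)(TCGCache *cache, uint64_t pc, void *code);
} TCGCacheOps;

struct TCGCache {
    const TCGCacheOps *ops;        /* implementation-specific methods */
    uint64_t pcs[CACHE_SLOTS];
    void *code[CACHE_SLOTS];
};

/* Lock-free variant: only safe when a single vCPU thread owns the cache. */
static void *percpu_lookup(TCGCache *c, uint64_t pc)
{
    size_t i = pc % CACHE_SLOTS;
    return (c->code[i] != NULL && c->pcs[i] == pc) ? c->code[i] : NULL;
}

static void percpu_insert(TCGCache *c, uint64_t pc, void *code)
{
    size_t i = pc % CACHE_SLOTS;
    c->pcs[i] = pc;
    c->code[i] = code;
}

static const TCGCacheOps percpu_ops = { percpu_lookup, percpu_insert };

typedef struct VCPU {
    /* Per-core pointer; several vCPUs could point at one shared cache. */
    TCGCache *cache;
} VCPU;
```

A caller would go through `cpu->cache->ops->lookup(cpu->cache, pc)`, which is an indirect call on every lookup.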


 Our next plan is to start putting an implementation plan together. Probably
 quite sketchy at this point, and we hope to start coding shortly.

BTW, I've added some links to the COREMU project, which was discussed long ago
on this list.


Best,
  Lluis

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



Re: [Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Alexander Graf


On 01.12.14 22:00, Lluís Vilanova wrote:
 Mark Burton writes:
 
 All - first a huge thanks for those who have contributed, and those who have
 expressed an interest in helping out.
 
 One issue I’d like to see more opinions on is the question of a cache per core,
 or a shared cache.
 I have heard anecdotal evidence that a shared cache gives a major performance
 benefit….
 Does anybody have anything more concrete?
 (of course we will get numbers in the end if we implement the hybrid scheme as
 suggested in the wiki - but I’d still appreciate any feedback).
 
 I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
 can then have its own methods for working with updates, making it much simpler
 to work with different implementations, like completely avoiding locks (per-cpu
 cache) or a hybrid approach like the one described in the wiki.

I don't think you want to have indirect function calls in the fast path ;).


Alex



Re: [Qemu-devel] Update on TCG Multithreading

2014-12-01 Thread Lluís Vilanova
Alexander Graf writes:

 On 01.12.14 22:00, Lluís Vilanova wrote:
 Mark Burton writes:
 
 All - first a huge thanks for those who have contributed, and those who have
 expressed an interest in helping out.
 
 One issue I’d like to see more opinions on is the question of a cache per core,
 or a shared cache.
 I have heard anecdotal evidence that a shared cache gives a major performance
 benefit….
 Does anybody have anything more concrete?
 (of course we will get numbers in the end if we implement the hybrid scheme as
 suggested in the wiki - but I’d still appreciate any feedback).
 
 I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
 can then have its own methods for working with updates, making it much simpler
 to work with different implementations, like completely avoiding locks (per-cpu
 cache) or a hybrid approach like the one described in the wiki.

 I don't think you want to have indirect function calls in the fast path ;).

Ooops, true; at least probably, since you're never sure how much the HW
prefetcher is going to outsmart you :)

Well, I guess that a define will have to do then. But I think it still makes
sense to refactor tb_* functions and such to have a TCGCache as first argument.
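
A minimal sketch of what that could look like, assuming made-up names (tb_find, tb_add, tb_find_percpu, tb_add_percpu) that are not QEMU's real tb_* functions: the cache is passed explicitly as the first argument, and a define binds the generic names to one implementation at build time, so the fast path makes direct calls that the compiler can inline.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; names do not match QEMU's real tb_* functions. */
#define TB_SLOTS 64

typedef struct TCGCache {
    uint64_t pcs[TB_SLOTS];   /* guest PC stored in each slot */
    void *code[TB_SLOTS];     /* translated code, NULL if the slot is empty */
} TCGCache;

/* Per-CPU variant: no locking, valid only for a single owning thread. */
static inline void *tb_find_percpu(TCGCache *cache, uint64_t pc)
{
    size_t i = pc % TB_SLOTS;
    return (cache->code[i] != NULL && cache->pcs[i] == pc) ? cache->code[i]
                                                           : NULL;
}

static inline void tb_add_percpu(TCGCache *cache, uint64_t pc, void *code)
{
    size_t i = pc % TB_SLOTS;
    cache->pcs[i] = pc;
    cache->code[i] = code;
}

/*
 * Build-time selection: callers use tb_find()/tb_add() and get direct calls.
 * A build that wanted a shared cache would point these names at locked
 * variants with the same signatures instead.
 */
#define tb_find tb_find_percpu
#define tb_add  tb_add_percpu
```

The cost is that the choice is fixed per build, so comparing a per-CPU and a shared cache means compiling twice.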


Best,
  Lluis

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth