Hi Bayard,

On 06.05.2011 19:56, Bayard Bell wrote:
I've got 8 cores on my system, so I can hand it all over to the guests without 
sweating it. I'm looking at top, and I don't see any indication that other 
system load is contending. When I stop other apps running, it's only the amount 
of CPU idle time in the host that goes down, while the guest maintains the same 
level of CPU utilisation.
CPU isn't as easily "given" to the VM as, for example, RAM pages are.
VirtualBox internally needs to run a few threads doing disk/network IO, and the host OS has the same need, so essentially some experimentation is the best way to figure out how many vCPUs it is reasonable to give to the guest to get the best performance.
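If you want to run that experiment systematically, here is a minimal sketch
(the VM name "solaris-build", the SSH target and the build command are all
placeholders for whatever your setup uses):

#!/usr/bin/env python
# Sketch: time the same build at several vCPU counts and compare.
# Assumptions: a VM named "solaris-build" (placeholder), SSH access to
# the guest as builder@guest (placeholder), and a make-based build.
import subprocess
import time

VM = "solaris-build"  # hypothetical VM name

def run(cmd):
    subprocess.check_call(cmd)

for cpus in (1, 2, 4, 8):
    run(["VBoxManage", "modifyvm", VM, "--cpus", str(cpus)])
    run(["VBoxManage", "startvm", VM, "--type", "headless"])
    time.sleep(120)  # crude: give the guest time to boot

    start = time.time()
    run(["ssh", "builder@guest", "cd src && make clean && make -j8"])
    print("%d vCPUs: build took %.1fs" % (cpus, time.time() - start))

    run(["VBoxManage", "controlvm", VM, "acpipowerbutton"])
    time.sleep(60)   # crude: give the guest time to shut down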


The load I'm running is compilation. There shouldn't be a lot of system time, 
but the build system I'm using schedules more parallel jobs than there are 
CPUs, using both CPU count and memory size to determine the maximum number of 
jobs. What nevertheless seems odd is that when the Solaris guest thinks it's 
got 3 or 4 threads on CPU, utilisation is half what I'd expect.
With compilation, especially when you compile a lot of small files, a significant part of the load is fork/exec performance (and therefore the VMM in the guest), and of course IO matters too.
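A quick way to see how much fork/exec costs inside the guest compared to a
real box is a micro-benchmark along these lines (just a sketch; run it in the
guest and on bare metal and compare the rates):

#!/usr/bin/env python
# Sketch: measure fork/exec throughput by spawning a no-op binary in a loop.
import os
import time

N = 500
start = time.time()
for _ in range(N):
    pid = os.fork()
    if pid == 0:
        os.execv("/usr/bin/true", ["true"])  # child: exec a no-op binary
        os._exit(127)                        # only reached if exec fails
    os.waitpid(pid, 0)                       # parent: reap the child
elapsed = time.time() - start
print("%d fork/execs in %.2fs (%.0f per second)" % (N, elapsed, N / elapsed))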

Now, I can imagine a variety of reasons for this, plenty of which I understand 
only partly or not at all, but looking at CPUPalette.app (I'm not aware of 
anything on OS X that approximates the functionality of mpstat), it looks like 
the load on the system is being spread evenly across CPUs.
That's pretty much expected.
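If you want mpstat-like per-CPU numbers on the OS X host, a small sketch using
the third-party psutil module would do (psutil is an assumption here, not
something VirtualBox ships; any per-CPU sampler works):

#!/usr/bin/env python
# Sketch: rough mpstat substitute for the host, using the third-party
# psutil module. Prints per-CPU utilisation once a second so you can
# watch how the VM threads spread across cores.
import psutil

while True:
    # blocks for one second, then reports each logical CPU's busy percentage
    percpu = psutil.cpu_percent(interval=1, percpu=True)
    print("  ".join("cpu%d:%5.1f%%" % (i, p) for i, p in enumerate(percpu)))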
  My very naive reaction to this is that this isn't quite right: VirtualBox 
should be trying to maintain processor affinity, pushing each CPU flat out 
rather than subjecting itself to unnecessary additional SMP overhead, which 
compounds the SMP overhead of the guest.
It's up to the host OS scheduler to maintain (soft) affinity of threads in whatever way it thinks most reasonable. SMP overhead such as the need for TLB shootdowns can't be cured by forcing affinity; affinity only helps with reuse of CPU cache entries, and with TLB entries if some form of address space ID is used (or if switches happen inside the same address space).

  (My understanding is that the ability to create CPU affinity in OS X is a bit 
weak compared to Linux or Solaris [i.e. affinity is between threads and is 
meant to be defined by applications based on hw.cacheconfig and friends, 
whereas in Linux and Solaris it can be defined more strictly in terms of 
processors and processes].)
Don't think you really need that. As VBox doesn't do explicit gang scheduling, some assistance from the host scheduler on that would be helpful, rather than explicit assignment of CPU affinity. In theory, a good scheduler should gang-schedule threads sharing the same address space even without additional hints, as this will likely increase performance.
Not sure if OS X does that, though.


Nikolay

On 6 May 2011, at 11:17, Nikolay Igotti wrote:

   Hi Bayard,

The question is how you generate load in the guest, and what the real 
bottlenecks are. Generally, guest SMP maps to multiple threads of execution 
for guest code, but mind page table synchronization, device access locks and 
other factors adding overhead in the SMP case, sometimes more severe than it 
would be on a real box.

Also, if your box has just 4 CPUs, I wouldn't recommend assigning all of them 
to the guest.

  Thanks,
     Nikolay


Bayard Bell wrote:
Anyone?

On 23 Apr 2011, at 11:50, Bayard Bell wrote:


I've got an OpenSolaris guest that I'm using as a compile server, with Mac OS X 
Server as the host. I've assigned 4 CPUs to the guest, and the guest in fact 
sees 4 CPUs. From the host perspective, however, what I see is that the guest 
never ranges substantially above 200% (or 2 CPUs') utilisation, even when the 
run queue is backed up and 4 processes appear to be on CPU. I'm comparing the 
compile times against other configurations, and what I'm seeing in VirtualBox 
leads me to believe that I'm being presented 4 CPUs but can't actually consume 
more than 2. I haven't made any apples-to-apples comparison yet, but the guest 
nevertheless seems able to keep running under load that can't be sustained with 
only 2 CPUs assigned, which seems to indicate that the benefits of assigning 
more than 2 CPUs may be more about reducing context switching and CPU migration 
overhead in the guest than about providing the full benefit of increased 
compute resources (IOW, the benefit seems equivalent to providing hyperthreaded 
virtual CPUs rather than cores).

Is this expected behaviour? I've looked through the documentation and wasn't 
able to find any information on this. I'm running 4.0.6 and also saw this 
behaviour on 4.0.4.






_______________________________________________
vbox-dev mailing list
[email protected]
http://vbox.innotek.de/mailman/listinfo/vbox-dev
