Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread John-Mark Gurney
Alan Somers wrote this message on Tue, Jul 24, 2018 at 15:30 -0600:
> What are people's experiences with overcommitting CPUs in BHyve?  I have an
> 8-core machine that often runs VMs totalling up to 5 allocated CPUs without
> problems.  But today I got greedy.  I assigned 8 cores to one VM for a big
> build job.  Obviously, some of those were shared with the host.  I also
> assigned it 8GB of RAM (out of 16 total).  Build performance fell through
> the floor, even though the host was idle.  Eventually I killed the build
> and restarted it with a more modest 2 make jobs (but the VM still had 8
> cores).  Performance improved.  But eventually the system seemed to be
> mostly hung, while I had a build job running on the host as well as in the
> VM.  I killed both build jobs, which resolved the hung processes.  Then I
> restarted the host's build alone, and my system completely hung, with
> top(1) indicating that many processes were in the pfault state.
> 
> So my questions are:
> 1) Is it a known problem to overcommit CPUs with BHyve?

Likely, as someone else mentioned, it's the spin lock problem...  It's best if
you can schedule ALL of a guest's vCPUs at the same time, but obviously the more
vCPUs, the harder this becomes, and I don't believe that FreeBSD has a scheduler
that allows you to do this.

The late Benjamin Perrault (iirc) said that his limit was 7 vCPUs per
CPU; I don't remember if that was per core or per thread (likely core)...  But
I also don't know his workload, or his vCPUs per VM...

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."
___
freebsd-virtualization@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to 
"freebsd-virtualization-unsubscr...@freebsd.org"


Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread Rodney W. Grimes
> What are people's experiences with overcommitting CPUs in BHyve?  I have an
> 8-core machine that often runs VMs totalling up to 5 allocated CPUs without
> problems.  But today I got greedy.  I assigned 8 cores to one VM for a big
> build job.  Obviously, some of those were shared with the host.  I also
> assigned it 8GB of RAM (out of 16 total).  Build performance fell through
> the floor, even though the host was idle.  Eventually I killed the build
> and restarted it with a more modest 2 make jobs (but the VM still had 8
> cores).  Performance improved.  But eventually the system seemed to be
> mostly hung, while I had a build job running on the host as well as in the
> VM.  I killed both build jobs, which resolved the hung processes.  Then I
> restarted the host's build alone, and my system completely hung, with
> top(1) indicating that many processes were in the pfault state.
> 
> So my questions are:
> 1) Is it a known problem to overcommit CPUs with BHyve?
> 2) Could this be related to the pfault hang, even though the guest was idle
> at the time?

I do on occasion overcommit vCPUs in bhyve, but only under
a few specific conditions:

1) I count CPUs as real cores, not hyperthread cores; I do not
   expect hyperthreading to work well under overcommit.

2) I always wire my VMs' memory; I NEVER overcommit memory, that
   just leads to bad and ugly.  (The -S option to bhyveload and bhyve.)
   This takes ARC issues totally out of the picture, but you may
   not be able to start your VMs if you don't decrease the ARC.

3) Watch out for host-side disk drive IOPS saturation; you can easily
   stall your guests if you're trying to do too much I/O.  They usually
   recover from this on their own, though it can make things go
   pretty slow for a time.  Firing off 16 VMs doing "nightly" on
   a single-spindle host is a sure way to have some very long runs.
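Point 2 above, sketched as commands (the guest name, sizes, and disk path are invented for illustration; -S is the wire-memory flag mentioned, and the device slot layout is just one plausible arrangement):

```shell
# Load and start a hypothetical guest "buildvm" with 2 vCPUs and
# 4 GB of wired (-S) memory, so the host can never page it out.
bhyveload -S -m 4G -d /vm/buildvm/disk.img buildvm

bhyve -c 2 -m 4G -S -A -H -P \
    -s 0,hostbridge \
    -s 3,virtio-blk,/vm/buildvm/disk.img \
    -s 31,lpc -l com1,stdio \
    buildvm
```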

My workload ranges from an always-running 6-vCPU light load
to an occasional guest running make -j4 buildworlds (a total
10-vCPU load).  My host has 4 cores, 8 threads.  The 10-vCPU load
usually drives the host to a load average of 5.

The always-present 6-vCPU light load very rarely drives the
load average above 1.

I think the secret sauce is wired memory :-)

-- 
Rod Grimes rgri...@freebsd.org


Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread Jason Tubnor
On Wed, 25 Jul 2018 at 08:12, Shawn Webb  wrote:

> On Tue, Jul 24, 2018 at 03:30:32PM -0600, Alan Somers wrote:
> > What are people's experiences with overcommitting CPUs in BHyve?  I have an
> > 8-core machine that often runs VMs totalling up to 5 allocated CPUs without
> > problems.  But today I got greedy.  I assigned 8 cores to one VM for a big
> > build job.  Obviously, some of those were shared with the host.  I also
> > assigned it 8GB of RAM (out of 16 total).  Build performance fell through
> > the floor, even though the host was idle.  Eventually I killed the build
> > and restarted it with a more modest 2 make jobs (but the VM still had 8
> > cores).  Performance improved.  But eventually the system seemed to be
> > mostly hung, while I had a build job running on the host as well as in the
> > VM.  I killed both build jobs, which resolved the hung processes.  Then I
> > restarted the host's build alone, and my system completely hung, with
> > top(1) indicating that many processes were in the pfault state.
> >
> > So my questions are:
> > 1) Is it a known problem to overcommit CPUs with BHyve?
> > 2) Could this be related to the pfault hang, even though the guest was idle
> > at the time?
>

1) Not that I have experienced.
2) More likely RAM pressure.  Are you running ZFS?  What is your ARC capped
at?  (Total guest + system + ARC < total system RAM.)


> VMWare's ESXi uses a special scheduler to do what it does. I wonder if
> it would be worthwhile to investigate implementing a scheduler in
> FreeBSD that provides decent performance for virtualized workloads.
>
>
>


Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread Alan Somers
An anonymous BHyve expert has explained things to me off-list.  Details
below.

On Tue, Jul 24, 2018 at 3:30 PM, Alan Somers  wrote:

> What are people's experiences with overcommitting CPUs in BHyve?  I have
> an 8-core machine that often runs VMs totalling up to 5 allocated CPUs
> without problems.  But today I got greedy.  I assigned 8 cores to one VM
> for a big build job.  Obviously, some of those were shared with the host.
> I also assigned it 8GB of RAM (out of 16 total).  Build performance fell
> through the floor, even though the host was idle.  Eventually I killed the
> build and restarted it with a more modest 2 make jobs (but the VM still had
> 8 cores).  Performance improved.  But eventually the system seemed to be
> mostly hung, while I had a build job running on the host as well as in the
> VM.  I killed both build jobs, which resolved the hung processes.  Then I
> restarted the host's build alone, and my system completely hung, with
> top(1) indicating that many processes were in the pfault state.
>
> So my questions are:
> 1) Is it a known problem to overcommit CPUs with BHyve?
>

Yes, it's a problem, and it's not unique to BHyve.  The problem comes from
constructs like spinlocks.  Unlike normal userland locks, when two CPUs contend
on a spinlock both keep running, busy-waiting.  When two vCPUs are contending
on a spinlock, the host has no idea how to prioritize them.  Normally
that's not a problem, because physical CPUs are always supposed to be able
to run.  But when you overcommit vCPUs, some of them must be descheduled
at all times.  If a spinlock is contended by both a running vCPU and
a descheduled vCPU, then it might stay contended for a long time, and the
host's scheduler simply isn't able to fix that.  The problem is even worse
when you're using hyperthreading (which I am) because those eight logical
cores are really only four physical cores, and spinning on a spinlock
doesn't generate enough pipeline stalls to trigger a hyperthread switch.  So
it's probably best to stick with the n - 1 rule.  Overcommitting is OK if
all guests are single-vCPU, because then a guest's spinlock can never be held
by a descheduled sibling vCPU.  But my guests aren't all single-vCPU.

2) Could this be related to the pfault hang, even though the guest was idle
> at the time?
>

The expert suspects the ZFS ARC was competing with the guest for RAM.
IIUC, ZFS will sometimes greedily grow its ARC by swapping out idle parts
of the guest's RAM.  But the guest isn't aware of this behavior, and will
happily allocate memory from the swapped-out portion.  The result is a
battle between the ARC and the guest for physical RAM.  The best solution
is to limit the maximum amount of RAM used by the ARC with the
vfs.zfs.arc_max sysctl.
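As a back-of-the-envelope sketch of that cap on the 16 GB host described above (the 4 GB budget is an assumed example, not a recommendation; pick a value so that guest RAM + host needs + ARC fits in physical memory):

```shell
# Compute a 4 GiB ARC cap in bytes: 8 GB guest + ~4 GB for the
# host leaves ~4 GB for the ARC on a 16 GB machine.
ARC_MAX=$((4 * 1024 * 1024 * 1024))

# Persistent setting, as a line for /boot/loader.conf:
echo "vfs.zfs.arc_max=${ARC_MAX}"

# The tunable can also be set at runtime (as root):
#   sysctl vfs.zfs.arc_max=${ARC_MAX}
```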

More info: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222916

Thanks to everyone who commented, especially the Anonymous Coward.

-Alan


Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread Allan Jude
On 2018-07-24 17:30, Alan Somers wrote:
> What are people's experiences with overcommitting CPUs in BHyve?  I have an
> 8-core machine that often runs VMs totalling up to 5 allocated CPUs without
> problems.  But today I got greedy.  I assigned 8 cores to one VM for a big
> build job.  Obviously, some of those were shared with the host.  I also
> assigned it 8GB of RAM (out of 16 total).  Build performance fell through
> the floor, even though the host was idle.  Eventually I killed the build
> and restarted it with a more modest 2 make jobs (but the VM still had 8
> cores).  Performance improved.  But eventually the system seemed to be
> mostly hung, while I had a build job running on the host as well as in the
> VM.  I killed both build jobs, which resolved the hung processes.  Then I
> restarted the host's build alone, and my system completely hung, with
> top(1) indicating that many processes were in the pfault state.
> 
> So my questions are:
> 1) Is it a known problem to overcommit CPUs with BHyve?
> 2) Could this be related to the pfault hang, even though the guest was idle
> at the time?
> 
> -Alan

Bhyve has a command line flag, -p, to let you pin a vCPU to a physical
CPU. This might avoid some of the issues with the threads hopping around
all the time.
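A sketch of what that looks like (the guest name, disk path, and core numbers are made up; `-p vcpu:hostcpu` is the pinning syntax from bhyve(8)):

```shell
# Pin a 2-vCPU guest's virtual CPUs to dedicated host cores 2 and 3,
# so the vCPU threads stop migrating between physical CPUs.
bhyve -c 2 -p 0:2 -p 1:3 -m 4G -S -A -H -P \
    -s 0,hostbridge \
    -s 3,virtio-blk,/vm/buildvm/disk.img \
    -s 31,lpc -l com1,stdio \
    buildvm
```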

If you were anyone else, I'd also ask whether you ensured your
vfs.zfs.arc_max was low enough to actually leave some RAM for the VM to use.

-- 
Allan Jude





Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread Ruben
Using bhyve in several setups.  The only one that is overcommitted is a
quad-core AMD CPU (A4-5000) with 16 GB of RAM that runs 14 VMs (2 of them
dual-core, I think).


Rock solid.


On 07/24/2018 11:30 PM, Alan Somers wrote:

What are people's experiences with overcommitting CPUs in BHyve?



Re: Overcommitting CPUs with BHyve?

2018-07-24 Thread Shawn Webb
On Tue, Jul 24, 2018 at 03:30:32PM -0600, Alan Somers wrote:
> What are people's experiences with overcommitting CPUs in BHyve?  I have an
> 8-core machine that often runs VMs totalling up to 5 allocated CPUs without
> problems.  But today I got greedy.  I assigned 8 cores to one VM for a big
> build job.  Obviously, some of those were shared with the host.  I also
> assigned it 8GB of RAM (out of 16 total).  Build performance fell through
> the floor, even though the host was idle.  Eventually I killed the build
> and restarted it with a more modest 2 make jobs (but the VM still had 8
> cores).  Performance improved.  But eventually the system seemed to be
> mostly hung, while I had a build job running on the host as well as in the
> VM.  I killed both build jobs, which resolved the hung processes.  Then I
> restarted the host's build alone, and my system completely hung, with
> top(1) indicating that many processes were in the pfault state.
> 
> So my questions are:
> 1) Is it a known problem to overcommit CPUs with BHyve?
> 2) Could this be related to the pfault hang, even though the guest was idle
> at the time?

VMWare's ESXi uses a special scheduler to do what it does. I wonder if
it would be worthwhile to investigate implementing a scheduler in
FreeBSD that provides decent performance for virtualized workloads.

Thanks,

-- 
Shawn Webb
Cofounder and Security Engineer
HardenedBSD

Tor-ified Signal:+1 443-546-8752
Tor+XMPP+OTR:latt...@is.a.hacker.sx
GPG Key ID:  0x6A84658F52456EEE
GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89  3D9E 6A84 658F 5245 6EEE




Overcommitting CPUs with BHyve?

2018-07-24 Thread Alan Somers
What are people's experiences with overcommitting CPUs in BHyve?  I have an
8-core machine that often runs VMs totalling up to 5 allocated CPUs without
problems.  But today I got greedy.  I assigned 8 cores to one VM for a big
build job.  Obviously, some of those were shared with the host.  I also
assigned it 8GB of RAM (out of 16 total).  Build performance fell through
the floor, even though the host was idle.  Eventually I killed the build
and restarted it with a more modest 2 make jobs (but the VM still had 8
cores).  Performance improved.  But eventually the system seemed to be
mostly hung, while I had a build job running on the host as well as in the
VM.  I killed both build jobs, which resolved the hung processes.  Then I
restarted the host's build alone, and my system completely hung, with
top(1) indicating that many processes were in the pfault state.

So my questions are:
1) Is it a known problem to overcommit CPUs with BHyve?
2) Could this be related to the pfault hang, even though the guest was idle
at the time?

-Alan