Hello,

I also thought that enabling SMT would degrade the performance of a
single-threaded process, but I could not find anyone to confirm that.
Thank you, Christian.

Alan:
Postgres is not constrained by having only 4 cores. It's running fine with
1 vCPU. The problem is the overall OpenShift cluster CPU count. Right now
we have 3 worker nodes with 4 vCPUs each = 12 vCPUs, so we can schedule at
most 7 Postgres workloads.

Our tests show that running 7 identical Postgres workloads on OpenShift
simultaneously brings us to 66% IFL utilization on the CEC (we have 4
IFLs). This means there is still some "horsepower" available on the
mainframe, right?

What is the proper way to use z/VM and zLinux virtualization technology to
grow the OpenShift cluster capacity from 12 vCPUs to 24 vCPUs without
degrading performance? Which of the approaches below is correct?
- 6 worker nodes (zLinux), 4 vCPUs each?
- 1 worker node with 24 vCPUs? (let's set aside the single point of
failure for now)

I'm trying to understand whether it makes any difference which option I
take. In the end it's vCPUs competing for real CPU time on the CEC (and
with option 2 I avoid the overhead of running and managing extra OS
instances).
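For what it's worth, the raw overcommit arithmetic is identical for both
layouts; a quick sketch (node counts and vCPU figures are from the two
options above, while the SMT-2 factor is my assumption about how the IFLs
are configured):

```python
# Both layouts present the same total vCPU count, competing for the same
# physical capacity; only per-node OS overhead and scheduling granularity
# differ.
ifls = 4
smt_threads_per_ifl = 2              # assumes SMT-2 enabled on the IFLs
logical_cpus = ifls * smt_threads_per_ifl

layouts = {
    "6 worker nodes x 4 vCPU": 6 * 4,
    "1 worker node x 24 vCPU": 1 * 24,
}

for name, vcpus in layouts.items():
    overcommit = vcpus / logical_cpus
    print(f"{name}: {vcpus} vCPUs, {overcommit:.0f}:1 over logical CPUs")
```

Either way the hypervisor sees 24 vCPUs contending for 8 logical CPUs; the
difference between the layouts is in overhead and failure domains, not in
total capacity.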

All the best,
Mariusz



On Mon, 3 Aug 2020 at 08:29, Christian Borntraeger
<borntrae...@linux.ibm.com> wrote:

> On 02.08.20 06:36, Alan Altmark wrote:
> >
> > A physical core has a certain amount of “horsepower” in it. It can, at
> > top speed, do X amount of work.
> >
> > In SMT, you split the core in half, creating two execution contexts
> > (CPUs) instead of just one. The two CPUs share resources on the
> > physical core, but the total horsepower doesn’t increase. In fact, it
> > gets a little smaller, in the sense that the core must now spend cycles
> > managing the two CPUs (threads) on it. Some workloads need more
> > threads; other workloads need faster CPUs. So you choose between SMT
> > (threads) or non-SMT (speed). Knowing which is best means measuring
> > workload response times.
>
> To add some more detail: the sum of both SMT threads is usually larger
> than a single thread. This is because a CPU has many execution units
> (floating point, fixed point, etc.), but it can only execute instructions
> whose dependencies are resolved. So several units sit idle, e.g. after a
> wrong branch prediction, until the pipeline has enough work in the
> out-of-order window again. With SMT there are two independent
> dependency-tracking streams that can make use of the execution units.
>
> So as a rule of thumb: IF you have enough parallel threads and you are
> bound by overall capacity, enabling SMT is usually a win for throughput.
> The z15 technical guide says 25% on average over a single thread. As an
> example, that could mean instead of 100% you get 60% + 65%.
>
> What Alan was pointing out is that this of course DOES have an impact on
> latency. When a single thread only gets, let's say, 65%, its latency is
> larger. So you balance latency against throughput. And if you only have
> one thread on the whole system, that thread would be faster without SMT.
>
> Since latency may also depend on the question "do I get resources at
> all", I also think that for highly virtualized systems with many guests
> and overcommitted CPUs, SMT is usually a win, as z/VM or KVM has more
> threads across which to distribute load. There is even some math for
> queueing systems that can show reduced wait time for an idealized
> workload.
>
> There are cases where SMT is even worse, though. Some workloads really
> push the execution units to their limit, and if two such workloads have
> to split the overall number of, let's say, rename registers, that can
> actually hurt performance, so that the sum is smaller than a single
> thread. The CPUs got better over time: from z13 to z14 and from z14 to
> z15 we identified several of these cases and improved the CPUs. So on
> z14 and z15 SMT is a win most of the time.
>
>
> [...]
> > For these reasons, you generally do not want to have more virtual CPUs
> in a
> > guest than you have logical CPUs to run them on.
>
> Absolutely. Having more virtual CPUs than logical CPUs never makes sense
> apart from corner cases like testing. But this statement applies mostly
> to a single guest or a single LPAR.
>
> The sum of all virtual CPUs across guests can be higher, as long as
> there is enough idle time, i.e. not all CPUs run at 100% all the time.
> For example, with 4 IFLs and SMT you can have 8 vCPUs active at the same
> time. With mostly idle systems that could also mean, say, 100 guests
> with 1 vCPU each.
>

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www2.marist.edu/htbin/wlvindex?LINUX-390
