RE: CPU Capacity Planning

Boris Dali Sun, 07 Dec 2003 09:01:47 -0800

Thanks a lot for the reply, Cary. Yes, your
explanation makes all the sense in the world even
though it is precisely the weighted average approach
that I've seen on some capacity planning spreadsheets.


Two additional questions if I may, Cary.
Would it be correct to say that when I throw
additional users on a system it is only the queueing
component of response time that climbs, while
service time stays the same? If that's true, then does
it matter how I measure the service time of my Bus.Tx1 -
on a loaded system where hundreds of users run this
operation, or when nobody executes it at all? Also, is it
important to have the other two operations - Bus.Tx2
and Bus.Tx3 - running concurrently (as they would in
real life) for the c measurements?

In other words assuming I have an identical replica of
a production environment where I am the only user -
would service time/rate measured there be applicable
for a loaded system with heterogeneous workload?
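
One way to see the premise behind this question is a small M/M/m
sketch. It is only an illustration under assumptions that are not
stated in this thread: Poisson arrivals, exponentially distributed
service times, m identical CPUs, and a service time that does not
itself change under load. The function names are made up for the
example, and the 5 s service time on a 2-way configuration simply
borrows the Bus.Tx1 figures that appear elsewhere in this thread.

```python
from math import factorial


def erlang_c(m, rho):
    """Probability that an arriving request has to queue in an M/M/m system."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("per-CPU utilization rho must be in [0, 1)")
    a = m * rho  # offered load in erlangs (average number of busy CPUs)
    queued = a**m / factorial(m) / (1.0 - rho)
    not_queued = sum(a**k / factorial(k) for k in range(m))
    return queued / (not_queued + queued)


def mean_response_time(service_time, m, rho):
    """Mean response time = fixed service time + mean queueing delay."""
    queueing_delay = erlang_c(m, rho) * service_time / (m * (1.0 - rho))
    return service_time + queueing_delay


# Assumed inputs: 5 s of service time per execution, 2 CPUs.
S = 5.0
for rho in (0.1, 0.5, 0.7, 0.9):
    r = mean_response_time(S, 2, rho)
    print(f"rho={rho:.0%}  service={S:.1f}s  queueing={r - S:.2f}s  response={r:.2f}s")
```

In this model the service time is an input, so a value measured on
an idle replica carries over only to the extent that the real
service time stays constant under concurrency, which is exactly
what the question above asks.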



And another stupid question.
Knowing individual business tx. characteristics
(response time, number of CPUs required to comply with
SLA requirements, average utilization per CPU, etc),
how does one go about sizing the box in terms of the
CPU capacity the overall "system" requires? Or, to put it
another way - what do I tell a hardware vendor?

That is, if what comes out of a queueing exercise is:
            m      rho
          -----    ---
Bus.Tx1   2-way    70%
Bus.Tx2   3-way    50%
Bus.Tx3   4-way    80%

What should be the optimistic (let's assume perfect
linear CPU scalability for now) recommendation to
decision makers in terms of the horsepower required to
run this "system"?
After all, yes individual business transactions have
their own SLA requirements (e.g. worst tolerated
response time), but they all use the same resources,
don't they? So even though the service time of Bus.Tx1
might remain constant, the queueing delay (and hence
the response time) would likely increase due to
other concurrent activity on the system. Is there a
way to account for this if capacity planning is done
at the individual bus.tx level?
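
A rough starting point, sketched under assumptions that nobody in
this thread has confirmed: if each row of the table above is read
as m CPUs running at average utilization rho, then m * rho is the
average number of CPUs that transaction keeps busy, and the sum
over the three rows is a floor on what a shared box has to supply.
The variable names are hypothetical, and the arithmetic says
nothing about whether each SLA would still be met on the shared
box.

```python
from math import ceil

# (m, rho) per business transaction, copied from the table above.
sizing = {
    "Bus.Tx1": (2, 0.70),
    "Bus.Tx2": (3, 0.50),
    "Bus.Tx3": (4, 0.80),
}

# Average number of busy CPUs each transaction accounts for.
busy_cpus = {name: m * rho for name, (m, rho) in sizing.items()}

total_busy = sum(busy_cpus.values())           # aggregate CPU demand
sum_of_m = sum(m for m, _ in sizing.values())  # three separately sized boxes
floor_cpus = ceil(total_busy)                  # fewer CPUs than this would saturate

for name, busy in busy_cpus.items():
    print(f"{name}: {busy:.1f} CPUs busy on average")
print(f"aggregate demand: {total_busy:.1f} CPUs")
print(f"floor for a shared box: {floor_cpus} CPUs (sum of m: {sum_of_m})")
```

The floor here is 7 CPUs and the sum of the m values is 9. Where
the right answer falls depends on how the three workloads queue
against each other on shared CPUs, which this arithmetic does not
model.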

Thanks,
Boris Dali.

--- Cary Millsap <[EMAIL PROTECTED]> wrote:
> Boris,
> 
> If you mean that some people on your system execute
> Bus.Tx1, some others
> execute Bus.Tx2, and some others (maybe with some
> overlap) execute
> Bus.Tx3, then my answer to your question is:
> 
>       No, I would strongly encourage you *not* to do this!
> 
> It was exercises like this that first led me to
> discover the fact that
> there's no such thing as a "system" in the sense
> that most people use
> the term (that is, as a big mishmash of different
> transactions, in which
> averages have any real meaning).
> 
> Combining your three CDFs will hurt you in the way
> described in "Why
> understanding distribution is important" on
> pp238-239 of the Optimizing
> Oracle Performance book. Here's another example:
> Imagine the following
> "system"...
> 
>        avg.       avg.
>      runs/day   sec/run   who uses it
> Tx1   10,000        1       Group A
> Tx2    1,000       10       Group B
> Tx3      100      100       Group C
> 
> So, what's this "system's" average response time? A naïve
> "mathematician" might think it's the weighted average of all the
> response times: (10000*1 + 1000*10 + 100*100) /
> (10000+1000+100) = 2.7
> sec. But what use is this figure? Nobody's response
> time is "ever"
> really 2.7 sec. <footnote>I say "ever" here because
> it's of course
> possible that a program whose "avg. sec/run" is 1
> (or even 10) will
> occasionally have a true response time of
> 2.7.</footnote>
> 
> If you're *anybody* actually using the system, the
> number "2.7 sec/run"
> is just stupid! The 2.7s figure is especially
> ludicrous if you're a
> member of Group B or C, because your average
> response time is either
> really 3.7x that number (B) or 37x that number (C)!
> The mathematical
> explanation for the stupid-looking-ness is that, no
> matter what you're
> doing, this 2.7 number is an average influenced by
> stuff that you're
> *not* doing.
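
The arithmetic in the quoted example can be checked with a few
lines. This is only a sketch that reuses the runs/day and sec/run
figures from the quoted table; the variable names are made up for
the illustration.

```python
# (runs per day, average seconds per run) from the quoted table.
workload = {
    "Tx1": (10_000, 1.0),
    "Tx2": (1_000, 10.0),
    "Tx3": (100, 100.0),
}

total_runs = sum(runs for runs, _ in workload.values())
total_seconds = sum(runs * secs for runs, secs in workload.values())

# The blended "system" average that the quoted text warns against.
blended = total_seconds / total_runs

# How far each group's own average sits from the blended figure.
ratios = {name: secs / blended for name, (_, secs) in workload.items()}

print(f"blended average: {blended:.1f} sec/run")
for name, ratio in ratios.items():
    print(f"{name}: own average is {ratio:.2f}x the blended figure")
```

Each transaction type contributes the same total number of seconds
per day (10,000), so the blended figure is pulled toward the
shortest transaction purely because it runs most often.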
> 
> There is no such thing as an "average user" (any
> more than there's an
> American family with 2.3 children); in this example,
> there are only
> members of Groups A, B, and C. What if you're a
> member of two groups
> simultaneously (e.g., you run different transaction
> types in the same
> day)? It's the same problem, because your
> expectation of, for example,
> Tx1 response time is completely different from your
> expectation of Tx3
> response time. Clumping response times from Tx1 and
> Tx2 into one average
> makes no sense even then. Expecting *anything* you
> do to take 2.7
> sec/run is going to leave you unfulfilled.
> 
> The number 2.7 has no useful meaning here.
> Certainly, if you have some
> kind of service level agreement (SLA) wired to the
> number 2.7 sec/run in
> this case, then I would say you have a SLA that's
> worse than having no
> SLA at all.
> 
> Maybe an easier analogy is this... How would you
> respond to the
> question, "Using the global air transportation
> system, how long does it
> take to fly someplace?" One way to answer would be
> to compute a weighted
> average of all flight durations recorded by IAPA for
> the past 12 months.
> Imagine that the worldwide average is 2.7 hours. How
> much good does this
> do someone who really wants to know how long it
> takes to get from
> Chicago to Sydney? How 'bout Chicago to Detroit? No,
> it's fundamentally
> the wrong way to respond. The right way to respond
> to "How long does it
> take to fly someplace?" begins with asking the
> question "From where to
> where?"
> 
> The "problem with averages" comes when the
> statistics you're trying to
> average don't come from a single well-behaved
> distribution. See
> pp236-254 of "Optimizing Oracle Performance" for a
> more complete
> explanation of what I mean by this.
> 
> Virtually every computer system used by participants
> on this list has
> two or more transaction types whose performance
> characteristics do not
> come from a single well-behaved statistical
> distribution. On these
> systems, it is impossible to come up with a single
> number (an average)
> that will present a useful description of your
> "system." And I mean
> "impossible" in the strictest, most carefully
> considered sense.
> 
> Bottom line: Do not attempt to combine your Bus.Tx#
> data in the way you
> describe.
> 
> 
> Cary Millsap
> Hotsos Enterprises, Ltd.
> http://www.hotsos.com
> 
> Upcoming events:
> - Performance Diagnosis 101: 12/16 Detroit, 1/27
> Atlanta
> - SQL Optimization 101: 12/8 Dallas, 2/16 Dallas
> - Hotsos Symposium 2004: March 7-10 Dallas
> - Visit www.hotsos.com for schedule details...
> 
> 
> -----Original Message-----
> Boris Dali
> Sent: Saturday, December 06, 2003 11:59 AM
> To: Multiple recipients of list ORACLE-L
> 
> Let's say I have 3 business transactions (consisting
> of numerous Oracle transactions each) and I know total
> service time for each (from c readings off sql traces
> for the length of the bus.tx). Doing queuing theory
> exercise I can also get CDF(r max) for each. Let's say
> 
> Bus.Tx1 - CPU time=5s   CDF(r)=97%
> Bus.Tx2 - CPU time=8s   CDF(r)=95%
> Bus.Tx3 - CPU time=10s  CDF(r)=90%
> 
> How can I combine these three together and make any
> conclusions as to what the overall CDF(r) would be for
> the whole system consisting of the above 3 business
> transactions? Is this doable?
> 
> Thanks,
> Boris Dali.
=== message truncated === 

