I think you are missing several things. (And those jiffies are 10 ms increments, not 1 ms: /proc reports utime/stime in USER_HZ clock ticks, 100 per second, so you should be dividing by 100, not 1000.)
First, have you verified the accuracy of your CPU numbers? Does RHEL4 include the steal-time patch, and is it working correctly on both VMware and z/VM? Even with current levels of Linux and z/VM, the CPU numbers must be corrected.

Second, what is your target peak CPU utilization for VMware, and for z/VM? I would expect 50% for VMware (being very generous, I think) and 90% for z/VM. So on z/VM the processors deliver an extra 80% of usable CPU seconds.

Third, it looks like you are measuring one "batch" process. Real work would have lots of processes, switching between workloads even thousands of times per second. The cache technology in the z10 is vastly superior, and it will produce better CPU numbers when you measure an environment closer to production reality.

Stewart Thomas J wrote:
Need some assistance on understanding a workload comparison. Here is what we have: we ran a business workload (Java/WebSphere) for one week on an HP DL585 G5 server with four Quad-Core AMD Opteron processors, model 8389 (2.9 GHz), on Red Hat Enterprise Linux 4, kernel version 2.6.9, virtualized under VMware ESX. Using /proc/$$/stat, we see that our process ID consumed 23,525 seconds of "CPU time". We base this "CPU time" on the utime/stime values (from issuing a cat against /proc/$$/stat). Our understanding is that this gives us the total jiffies consumed, and we then divide by 1000 since the jiffy timer is a millisecond. That is how we calculated the "CPU time" in seconds.
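[Editor's note: for reference, the utime/stime fields in /proc/&lt;pid&gt;/stat are counted in USER_HZ clock ticks, which is 100 per second on Linux regardless of the kernel's internal HZ. A minimal parsing sketch, with field positions taken from proc(5); the sample input in the usage note is made up:]

```python
def cpu_seconds(stat_text, ticks_per_sec=100):
    """Return utime + stime from a /proc/<pid>/stat line, in seconds.

    Fields 14 and 15 (1-based) are utime and stime, counted in USER_HZ
    clock ticks -- 100 per second on Linux, so divide by 100, not 1000.
    The comm field (field 2) can contain spaces, so split only after the
    closing parenthesis rather than splitting the whole line.
    """
    after_comm = stat_text.rsplit(')', 1)[1].split()
    # after_comm[0] is field 3 (state), so utime (field 14) is index 11
    utime, stime = int(after_comm[11]), int(after_comm[12])
    return (utime + stime) / ticks_per_sec
```

In practice you would feed it the contents of /proc/$$/stat; e.g. `cpu_seconds(open('/proc/self/stat').read())`.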
We ran this same load on a System z10 EC for a week. This is a z/VM 5.3 LPAR with RHEL4
running as a guest. On the mainframe, we see that our process id consumed 25,649 seconds of "cpu time".
We generated what we call an equivalence factor: 23,525 / 25,649 = 0.9172. Based on this, we believe we'll need ~10% more z10 CPU cores to process our workload than we would on our comparison platform.
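[Editor's note: the arithmetic behind the factor, spelled out with the raw numbers from the post. Note that the "~10% more" figure is the reciprocal of the 0.9172 factor:]

```python
# CPU seconds measured for the same one-week run (figures from the post)
x86_cpu_seconds = 23525.0   # VMware ESX / Opteron DL585 G5
z10_cpu_seconds = 25649.0   # z/VM 5.3 guest on a System z10 EC

# The post's equivalence factor: x86 seconds per z10 second
equivalence_factor = x86_cpu_seconds / z10_cpu_seconds   # ~0.9172

# Equivalent statement: the z10 run consumed ~9% more CPU seconds
extra_z10_fraction = z10_cpu_seconds / x86_cpu_seconds - 1.0
```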
The question for the audience is: are we misunderstanding jiffies or the /proc/$$/stat timers for this CPU calculation? We're wondering if we might be missing something insanely obvious in comparing CPU time (cores) in this fashion, or if this does seem reasonable for a Java/WebSphere workload.
For reference, we have someone in doing a TCO for our workload using generalized spreadsheets for the calculations, and the numbers from our internal comparison are way off from their total estimated IFL count. As an example of what I'm talking about: say we have 68 x86 cores for this workload, and during overlapping peak times we totally consume 30 of them. Based on our equivalence-factor calculation above, we are saying we'll need >30 IFLs to handle these peaks. Based on the generalized spreadsheet calculations, those doing the TCO claim we can run this peak workload on 8 IFLs. So essentially my main question, from your experiences, is whether our own calculations make more sense, or whether the generalized spreadsheet can be trusted for accuracy.
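[Editor's note: for concreteness, the gap between the two estimates, using only the figures in the post. This sketch deliberately ignores utilization targets, consolidation effects, and cache behavior, which is presumably where the spreadsheet's assumptions differ from the raw measurement:]

```python
# Peak sizing two ways (all inputs taken from the post; illustrative only)
peak_x86_cores_busy = 30                     # x86 cores fully consumed at peak
equivalence_factor = 23525.0 / 25649.0       # measured x86/z10 CPU-seconds ratio

# Naive projection from the raw measurement: the ">30 IFLs" claim
naive_ifls = peak_x86_cores_busy / equivalence_factor      # ~32.7 IFLs

# The TCO spreadsheet's answer, and the per-core ratio it implies
spreadsheet_ifls = 8
implied_cores_per_ifl = peak_x86_cores_busy / spreadsheet_ifls   # 3.75 x86 cores per IFL
```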
Any advice or experiences would be welcomed.

Tom Stewart
Mainframe OS, Networking & Security
Deere & Company Computer Center
www.johndeere.com

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390