Allan - "...count the beans differently..." Yes, I'm beginning to get used to that concept. For instance, with the CPU Utilisation data that I *have* been able to retrieve, the metric given is not 'CPU%', but 'Number of cores'. I'm having to do some rapid re-orienting of my way of thinking. As for the memory size, I've got "mem-gb" : 2 defined in my start.json file, but I've not seen any indication of paging load at all in my testing.
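To make the 'Number of cores' versus 'CPU%' re-orientation concrete, here is a minimal Python sketch of the conversion. It assumes the 'Number of cores' figure is the average number of cores kept busy over the sample interval, which the thread does not confirm. The function name and the sample values are purely illustrative.

```python
def cores_to_cpu_percent(cores_used: float, cores_available: int) -> float:
    """Express a 'number of cores' reading as a percentage of the cores available.

    Assumption: cores_used is the average number of cores busy over the
    sample interval (not confirmed by the monitoring tool's documentation).
    """
    if cores_available <= 0:
        raise ValueError("cores_available must be positive")
    return 100.0 * cores_used / cores_available


if __name__ == "__main__":
    # Illustrative values only: 4 cores busy on a 5-engine system.
    print(cores_to_cpu_percent(4.0, 5))  # prints 80.0
```

The denominator matters: the same reading gives a different percentage depending on whether it is divided by the cores given to the container or by the engines defined to the z/OS system.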
Michael - 5 zIIPs? I wish! Nope - these are all general-purpose processors. The z/OS system I'm using is a z/VM guest on a system run by an external supplier, so I'm not sure if defining zIIPs would actually achieve anything. (Is it possible to dedicate a zIIP engine to a specific z/VM guest? That's a road I've not yet gone down.)

With regard to the WLM definitions, I followed the advice in the Redbook and I'm reasonably certain I've got it right. Having said that, cross-refer to a thread that I started earlier this week, titled "WLM Query". The response to that led to me defining a resource group to cap the started task to 10 MSU, which resulted in a CPU% Util value of roughly 5% - something I could be happy with. Under that cap, the started task ran, yes, but it ran like a three-legged dog (my apologies to limb-count-challenged canines). Start-up of the task, from the START command to the "server is listening..." message, took over an hour, and STOP-command-to-task-termination took approx. 30 minutes. (SSH-ing to the task was a bit of a joke, too. Responses to simple commands like 'docker ps -a' could be seen 'painting' across the screen, character-by-character...)

As a result, I've moved away from trying to limit the task for the time being. I'm concentrating on attempting to get cadvisor to be a bit less greedy.

Regards
Sean

On Wed, 26 Aug 2020 at 13:49, Michael Babcock <bigironp...@gmail.com> wrote:

> I can’t check my zCX out right now since my internet is down.
>
> You are running these on zIIP engines correct? Must be nice to have 5 zIIPs! And have the WLM parts in place? Although it probably wouldn’t make much difference during startup/shutdown.
>
> On Wed, Aug 26, 2020 at 3:40 AM Sean Gleann <sean.gle...@gmail.com> wrote:
>
> > Can anyone offer advice, please, with regard to monitoring the system resource consumption of a zcx Container task?
> >
> > I've got a zcx Container task running on a 'sandbox' system where - as yet - I'm not collecting any RMF/SMF data. Because of that, my only source of system usage is the SDSF DA panel. I feel that the numbers I see there are... 'questionable' is the best word I can think of.
> >
> > Firstly, the EXCP-count for the task goes up to about 15360 during the initial start-up phase, but then it stays there until the STOP command is issued. At that point, EXCP-count starts rising again, until the task finally terminates. The explanation for that is probably because all the I/O is being handled internally at the 'Linux' level - the task must be doing *some* I/O, right? - but the data isn't getting back to SDSF for some reason. Without the benefit of SMF data to examine, I'm wondering if this is part of a larger problem.
> >
> > The other thing that troubles me is the CPU% busy value. My sandbox system has 5 engines defined, and in the 'start.json' file that controls the zcx Container task, I've specified a 'cpu' value of 4. During the start-up phase for the Container started task, SDSF shows CPU% values of approx 80%, but when the task is finally initialised, this drops to 'tickover' rates of about 1%. I'm happy with that - the initial start-up of *any* task as complex as a zcx Container is likely to cause high CPU usage, and the subsequent drop to the 1% levels is fine by me.
> >
> > But... Once the Container task is started and I've ssh'd into it, I then want to monitor its 'internal' system consumption. I've been using the 'Getting Started...' redbook as my guide throughout all this project, and it talks about using "Nodeexporter", "Cadvisor", "Prometheus" and "Grafana" as tools for this.
> > I've got all those things installed and I can start and stop them quite happily, but I've found that using Cadvisor on its own can drive CPU% levels back up to 80% for the entire time it is running. If a system is running flat-out when all it is doing is monitoring itself, well, there's something wrong somewhere... I'm trying to find an idiot's guide to controlling what Cadvisor does, but as yet I've been unsuccessful.
> >
> > Regards
> > Sean
> >
> > ----------------------------------------------------------------------
> > For IBM-MAIN subscribe / signoff / archive access instructions,
> > send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
>
> --
> Michael Babcock
> OneMain Financial
> z/OS Systems Programmer, Lead