Re: ZCX task monitoring, anyone?

Sean Gleann Wed, 26 Aug 2020 06:44:24 -0700

Allan - "...count the beans differently..." Yes, I'm beginning to get used
to that concept. For instance, with the CPU Utilisation data that I *have*
been able to retrieve, the metric given is not 'CPU%', but 'Number of
cores'. I'm having to do some rapid re-orienting of my way of thinking.
As for the memory size, I've got "mem-gb" : 2 defined in my start.json
file, but I've not seen any indication of paging load at all in my testing.
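
To make the 'Number of cores' versus 'CPU%' point concrete, here is a minimal sketch of the conversion. It assumes the metric is derived from a cumulative CPU-seconds counter sampled at two points in time, which is how Prometheus-style tooling commonly reports it. The function names and the sample figures are made up for illustration and are not taken from the actual system.

```python
def cores_used(cpu_seconds_start, cpu_seconds_end, interval_seconds):
    """Average number of cores kept busy between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (cpu_seconds_end - cpu_seconds_start) / interval_seconds


def percent_of_capacity(cores, capacity_cores):
    """Express a 'number of cores' figure as a percentage of a capacity."""
    if capacity_cores <= 0:
        raise ValueError("capacity_cores must be positive")
    return 100.0 * cores / capacity_cores


if __name__ == "__main__":
    # Hypothetical samples: 48 CPU-seconds consumed in a 15-second window.
    cores = cores_used(100.0, 148.0, 15.0)
    # Relative to the 4 CPUs given to the container in start.json.
    print(f"{cores:.2f} cores, {percent_of_capacity(cores, 4):.0f}% of 4 CPUs")
    # Relative to the 5 engines defined to the z/OS system.
    print(f"{cores:.2f} cores, {percent_of_capacity(cores, 5):.0f}% of 5 engines")
```

The same 'cores' figure gives a different percentage depending on which capacity it is divided by, which is one reason numbers from inside the container and from SDSF need not agree.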


Michael - 5 zIIPs?   I wish!  Nope - these are all general-purpose
processors.
The z/OS system I'm using is a z/VM guest on a system run by an external
supplier, so I'm not sure if defining zIIPs would actually achieve anything
(Is it possible to dedicate a zIIP engine to a specific z/VM guest? That's
a road I've not yet gone down).
With regard to the WLM definitions, I followed the advice in the Redbook
and I'm reasonably certain I've got it right. Having said that, cross-refer
to a thread that I started earlier this week, titled "WLM Query".
The response to that led to me defining a resource group to cap the
started task to 10 MSU, which resulted in a CPU% Util value of roughly 5% -
something I could be happy with.
Under that cap, the started task ran, yes, but it ran like a three-legged
dog (my apologies to limb-count-challenged canines).
Start-up of the task, from the START command to the "server is
listening..." message took over an hour, and
STOP-command-to-task-termination took approx. 30 minutes.
(SSH-ing to the task was a bit of a joke, too. Responses to simple commands
like 'docker ps -a' could be seen 'painting' across the screen,
character-by-character...)
As a result, I've moved away from trying to limit the task for the time
being. I'm concentrating on attempting to get cAdvisor to be a bit less
greedy.
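
As a rough sketch of what "less greedy" might look like: cAdvisor accepts runtime flags that reduce how often it samples and how much it collects. The helper below only builds a 'docker run' command line; it does not start anything. The image name is a placeholder, the function name and default values are invented for illustration, and the flag names (--housekeeping_interval, --docker_only, --disable_metrics) should be checked against the cAdvisor version actually installed, since the accepted metric names vary between releases. Whether Docker's --cpus limit behaves as expected under zCX is also an assumption.

```python
import shlex


def cadvisor_run_command(image, housekeeping_interval="30s",
                         disabled_metrics=("disk", "tcp", "udp"),
                         docker_only=True, cpu_limit=None, host_port=8080):
    """Build (but do not execute) a 'docker run' argument list for cAdvisor."""
    cmd = ["docker", "run", "--detach", "--name", "cadvisor",
           "--publish", f"{host_port}:8080"]
    if cpu_limit is not None:
        # Docker option: cap this one container instead of the whole task.
        cmd.append(f"--cpus={cpu_limit}")
    # Read-only mounts that cAdvisor conventionally needs to see the host.
    for mount in ("/:/rootfs:ro", "/var/run:/var/run:ro",
                  "/sys:/sys:ro", "/var/lib/docker/:/var/lib/docker:ro"):
        cmd += ["--volume", mount]
    cmd.append(image)
    # Everything after the image name is passed to cAdvisor itself.
    cmd.append(f"--housekeeping_interval={housekeeping_interval}")
    if docker_only:
        cmd.append("--docker_only=true")
    if disabled_metrics:
        cmd.append("--disable_metrics=" + ",".join(disabled_metrics))
    return cmd


if __name__ == "__main__":
    # 'example/cadvisor-s390x:latest' is a placeholder, not a real image name.
    args = cadvisor_run_command("example/cadvisor-s390x:latest", cpu_limit=0.5)
    print(" ".join(shlex.quote(a) for a in args))
```

Lengthening the housekeeping interval is usually the first thing to try, since the default sampling rate is far more frequent than a Prometheus scrape would need.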

Regards
Sean

On Wed, 26 Aug 2020 at 13:49, Michael Babcock <bigironp...@gmail.com> wrote:

> I can’t check my zCX out right now since my internet is down.
>
> You are running these on zIIP engines correct? Must be nice to have 5
> zIIPs!  And have the WLM parts in place?   Although it probably wouldn’t
> make much difference during startup/shutdown.
>
> On Wed, Aug 26, 2020 at 3:40 AM Sean Gleann <sean.gle...@gmail.com> wrote:
>
> > Can anyone offer advice, please, with regard to monitoring the system
> > resource consumption of a zcx Container task?
> >
> > I've got a zcx Container task running on a 'sandbox' system where - as
> > yet - I'm not collecting any RMF/SMF data. Because of that, my only source
> > of system usage is the SDSF DA panel. I feel that the numbers I see there
> > are... 'questionable' is the best word I can think of.
> >
> > Firstly, the EXCP-count for the task goes up to about 15360 during the
> > initial start-up phase, but then it stays there until the STOP command is
> > issued. At that point, EXCP-count starts rising again, until the task
> > finally terminates. The explanation for that is probably because all the
> > I/O is being handled internally at the 'Linux' level - the task must be
> > doing *some* I/O, right? - but the data isn't getting back to SDSF for
> > some reason. Without the benefit of SMF data to examine, I'm wondering if
> > this is part of a larger problem.
> >
> > The other thing that troubles me is the CPU% busy value. My sandbox
> > system has 5 engines defined, and in the 'start.json' file that controls
> > the zcx Container task, I've specified a 'cpu' value of 4. During the
> > start-up phase for the Container started task, SDSF shows CPU% values of
> > approx 80%, but when the task is finally initialised, this drops to
> > 'tickover' rates of about 1%. I'm happy with that - the initial start-up
> > of *any* task as complex as a zcx Container is likely to cause high CPU
> > usage, and the subsequent drop to the 1% levels is fine by me.
> >
> > But... Once the Container task is started and I've ssh'd into it, I then
> > want to monitor its 'internal' system consumption. I've been using the
> > 'Getting Started...' redbook as my guide throughout all this project, and
> > it talks about using "Nodeexporter", "Cadvisor", "Prometheus" and
> > "Grafana" as tools for this. I've got all those things installed and I
> > can start and stop them quite happily, but I've found that using Cadvisor
> > on its own can drive CPU% levels back up to 80% for the entire time it is
> > running. If a system is running flat-out when all it is doing is
> > monitoring itself, well, there's something wrong somewhere... I'm trying
> > to find an idiot's guide to controlling what Cadvisor does, but as yet
> > I've been unsuccessful.
> >
> > Regards
> > Sean
> >
> > ----------------------------------------------------------------------
> > For IBM-MAIN subscribe / signoff / archive access instructions,
> > send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
> >
> --
> Michael Babcock
> OneMain Financial
> z/OS Systems Programmer, Lead
