Hello,
I hope someone can share their experience or shed some light on this problem. I will refer to Elasticsearch as "ES" in this email.

We are running 2 OpenShift clusters on 1 z/VM LPAR (16 logical CPUs, SMT-2).

Cluster 1 (development workload):
- 3x master node (each zLinux, 8 vCPU) (VSWITCH-1, VLAN 1)
- 3x worker node (each 10 vCPU) (VSWITCH-1, VLAN 1)
- 1x infra node (6 vCPU) (VSWITCH-1, VLAN 1) ("ES" ON) (high CPU use)

Cluster 2 (no workload, just "ES" ON):
- 3x master node (each zLinux, 4 vCPU) (VSWITCH-2, VLAN 2)
- 4x worker node (each 4 vCPU) (VSWITCH-2, VLAN 2)
- 2x infra node (6 vCPU) (VSWITCH-2, VLAN 2) ("ES" ON on each) (high CPU use)

Problem: with "ES" OFF on both clusters, the batch time of APP1 is ~600 seconds. With "ES" ON on both clusters, the batch time is ~1200 seconds.

Symptoms:
- high CPU steal on the zLinux nodes (top) while Elasticsearch is active
- bad network response (git clone, downloading images)
- CPU steal drops if we shut down Elasticsearch

With "ES" ON, z/VM Perfkit shows LPAR CPU at ~60% and CEC IFL usage at 40%.

Where do you expect the bottleneck, and what is causing the high CPU steal on the zLinux nodes?

Some more info: there is a Fluentd pod running on every cluster node, constantly sending log data (quite large amounts) to the infra node (Elasticsearch).

IBM gave a tip that CPU steal is accounted to a zLinux guest while the VSWITCH is processing network requests for that guest. If so, how can we solve this?
- run the guests on a direct-attached OSA?
- split the nodes across different VSWITCHes? (currently all nodes plus the DB run on one VSWITCH, same VLAN)
- something else?

The application we are using for testing is split into 6 microservices (processes). PROC1 reads a file and inserts the data into the DB. I saw that the CPU time (top) accounted to PROC1 while a file is processed is 10 seconds, but the real time we have to wait to see that work done is 60-300 seconds (depending on whether ES is running).

So far everyone just says "you need more IFLs". But why would I need more IFLs if I'm using only 40% of the CEC IFL capacity?
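In case it helps others reproduce the measurement: a minimal sketch of how steal can be sampled directly from /proc/stat on a guest, independent of top's refresh interval. This assumes a recent Linux kernel where field 9 of the "cpu" line is cumulative steal time in jiffies (the interval and field positions are standard, but nothing here is specific to my setup):

```shell
#!/bin/sh
# Sample system-wide CPU steal over a 10-second window.
# /proc/stat "cpu" line: user nice system idle iowait irq softirq steal ...
# ($2..$9 in awk terms); $9 is cumulative steal jiffies.
s1=$(awk '/^cpu /{print $9}' /proc/stat)
t1=$(awk '/^cpu /{print $2+$3+$4+$5+$6+$7+$8+$9+$10+$11}' /proc/stat)
sleep 10
s2=$(awk '/^cpu /{print $9}' /proc/stat)
t2=$(awk '/^cpu /{print $2+$3+$4+$5+$6+$7+$8+$9+$10+$11}' /proc/stat)
# Percentage of elapsed CPU time that was stolen by the hypervisor.
echo "steal: $(( (s2 - s1) * 100 / (t2 - t1) ))% over 10s"
```

Running this on an infra node with ES on vs. off (and during a PROC1 file load) gives a number you can correlate with the 60-300 second wall-clock gap.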
Thank you, Mariusz ---------------------------------------------------------------------- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www2.marist.edu/htbin/wlvindex?LINUX-390