Re: performance problems db2 after moving from AIX
> Where can I look for potential relief? Everyone was hoping for a better
> performance, not worse. I am hoping that there is something we can tweak
> to make this better.

Only because I didn't see it specifically in the thread yet: do you have similar large page size support/tuning in both environments?

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www2.marist.edu/htbin/wlvindex?LINUX-390
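A minimal, read-only way to compare the large-page story on the Linux side might look like this (a sketch; the `db2set` line is an assumption about the DB2 registry variable and should be verified against your DB2 level):

```shell
# What the kernel has configured and handed out in large pages:
grep -i huge /proc/meminfo            # HugePages_Total/Free and Hugepagesize
cat /proc/sys/vm/nr_hugepages         # pages explicitly reserved at the OS level
# What DB2 itself was told to use (assumed registry variable):
# db2set -all | grep -i LARGE         # e.g. DB2_LARGE_PAGE_MEM
```

If AIX was using 64K or 16M pages for the buffer pools and the Linux guest is running entirely on 4K pages, that alone can change the TLB behavior of a 150G shared-memory database.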
Re: performance problems db2 after moving from AIX
Can you provide the z/VM SRM parameters and MT status?

Regards,
Phil Tully
Chief Architect - z/VM & z/Linux
phil.tu...@adp.com
cell: 973-202-7427
1-800 377-0237,,5252697#

On 11/3/20, 12:25 PM, "Linux on 390 Port on behalf of Grzegorz Powiedziuk" wrote:

> Hi Jim,
> correction - we have z14, not z114 ... not sure why I keep calling our z14
> a z114 ;)
>
> We have 16 IFLs in total shared across 5 z/VM LPARs, but there is
> literally nothing running in there yet besides this one huge VM, which has
> 10 IFLs configured. We have plenty of spare memory left and this one VM
> has 150G configured. That is about the same as what they had for this
> database when it was on AIX. The number was calculated by the DBAs and
> seems OK. I am not sure how to tell whether DB2 is happy with what it has,
> but the Linux OS is definitely not starving for memory.
>
> That p7 was a 9117-MMD. And I just found that it had EC set to 10 but
> could pull up to 15 processors. I am not sure how that works over there.
Re: performance problems db2 after moving from AIX
On Tue, 3 Nov 2020 at 21:27, Grzegorz Powiedziuk wrote:
> In the performance monitor toolkit it shows around 12,000 diag x'9c'/s and
> 50 x'44'. But at this time of day everything is calm. I will check again
> tomorrow. Lots of diag x'9c' would indicate too many virtual CPUs, right?

You would expect the amount of spinning to go down when you vary some CPUs off from the Linux side. Since you seem to be running just this single massive guest, there's less concern about adjusting SHARE settings along with it.

It's not entirely clear from your description how identical this is to the AIX configuration. If you doubled the number of virtual CPUs for the guest because of SMT-2, with that many virtual CPUs the application locking may turn out to make that counter-productive.

There's no doubt a lot of wisdom in monitor data from the workload to explain the z/VM side. As Christian suggests, you would be wise to take advantage of your support arrangement and have a z/VM performance analyst look into it.

Rob
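Varying CPUs off from the Linux side, as suggested above, can be done on the fly; a sketch (CPU numbers are examples, `chcpu` is from util-linux, root required):

```shell
# Take two virtual CPUs offline and watch whether diag x'9c' rates and
# %steal drop under the same workload:
chcpu -d 8,9                                   # disable CPUs 8 and 9
lscpu -e                                       # confirm the new online set
# Bring one back later via the sysfs interface:
echo 1 > /sys/devices/system/cpu/cpu8/online
```

This is cheaper to experiment with than redefining the guest, and reversible without a reboot.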
Re: performance problems db2 after moving from AIX
Diag 9C is low cost; Diag 44 is not, but 50/s is a low number.

On 11/3/2020 12:27 PM, Grzegorz Powiedziuk wrote:
> On Tue, Nov 3, 2020 at 1:58 PM Grzegorz Powiedziuk wrote:
>> Thanks Christian. There is no paging (swapping) here besides the kernel's
>> regular housekeeping (vm.swappiness = 5). RHEL 7 doesn't give me
>> diag_stat in the debug filesystem, hmm.
>
> In the performance monitor toolkit it shows around 12,000 diag x'9c'/s and
> 50 x'44'. But at this time of day everything is calm. I will check again
> tomorrow. Lots of diag x'9c' would indicate too many virtual CPUs, right?
Re: performance problems db2 after moving from AIX
On Tue, Nov 3, 2020 at 3:25 PM Jim Elliott wrote:
> Gregory,
>
> Yes, thrashing. :-)

I like my name better and it even fits better :) I will keep an eye on page faults tomorrow, but we are not overcommitting memory at all. Unless something inside DB2 is cooking; in Linux there is no swapping and in z/VM no paging whatsoever. DB2 grabs about 95% of the memory, and it all goes into "shmem", which it uses for its buffers and such. It's not like DB2 has its own internal paging? Even if it did, I am sure the DBAs would be screaming by now.

Although the ~20% of CPU time spent in kernel mode is something I've been questioning (the rest goes straight into user time). I have no clue whether that is a lot for this type of workload or not. I've been blaming the massive number of filesystems/logical volumes (152) and the huge number of threads, processes and FDs. How do you best determine that there is no other kind of thrashing?

thanks!
Gregory
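A quick read-only way to see where that memory actually sits (DB2 shared memory vs. page cache) and whether the kernel is reclaiming anything; the awk column layout is assumed from standard Linux `ipcs -m` output:

```shell
# Shared memory vs. cache, from the kernel's point of view:
grep -E 'Shmem:|Cached:|SwapCached:' /proc/meminfo
# Sum the SysV shared memory segments DB2 allocated (column 5 = bytes):
ipcs -m | awk 'NR>3 && $5+0>0 {sum+=$5} END {printf "%.1f GiB SysV shm\n", sum/2^30}'
```

If Shmem accounts for most of the 150G and swap stays flat, classic memory thrashing is unlikely and the kernel-time question points more toward lock contention or I/O-path overhead.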
Re: performance problems db2 after moving from AIX
On Tue, Nov 3, 2020 at 1:58 PM Grzegorz Powiedziuk wrote:
> Thanks Christian.
> There is no paging (swapping) here besides the kernel's regular
> housekeeping (vm.swappiness = 5).
> RHEL 7 doesn't give me diag_stat in the debug filesystem, hmm.
>
> On Tue, Nov 3, 2020 at 12:37 PM Christian Borntraeger <
> borntrae...@linux.ibm.com> wrote:
>>
>> So you at least had some time where you paged out memory.
>> If you have sysstat installed it would be good to get some history data
>> of cpu and swap.
>>
>> You can also run "vmstat 1 -w" to get an online view of the system load.
>> Can you also check (as root)
>> /sys/kernel/debug/diag_stat
>> 2 times and see if you see excessive diagnose 9c rates.

In the performance monitor toolkit it shows around 12,000 diag x'9c'/s and 50 x'44'. But at this time of day everything is calm. I will check again tomorrow. Lots of diag x'9c' would indicate too many virtual CPUs, right?
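For the missing diag_stat: the file needs debugfs mounted, and the interface itself only exists in newer kernels (it was added upstream around 4.5, so the 3.10-based RHEL 7 kernel likely simply does not have it; that is an assumption worth verifying). Where it does exist, the check Christian describes is:

```shell
# Mount debugfs if it is not already mounted, then sample twice:
mountpoint -q /sys/kernel/debug || mount -t debugfs none /sys/kernel/debug
cat /sys/kernel/debug/diag_stat
sleep 10
cat /sys/kernel/debug/diag_stat   # counters are cumulative; diff the readings for a rate
```

On RHEL 7 the Performance Toolkit numbers quoted above are the practical substitute.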
Re: performance problems db2 after moving from AIX
Gregory,

Yes, thrashing. :-)

Jim Elliott
Senior IT Consultant - GlassHouse Systems Inc.

On Tue, Nov 3, 2020 at 2:07 PM Grzegorz Powiedziuk wrote:
> Thanks Jim, this is encouraging. We do have the Performance Toolkit and
> I've been running sar here as well. When you say thrashing, do you mean
> memory thrashing?
Re: performance problems db2 after moving from AIX
On Tue, Nov 3, 2020 at 1:57 PM Jim Elliott wrote:
> Gregory,
>
> The 9117-MMD could range from 1 chip/4 cores all the way up to 16 chips/64
> cores at either 3.80 or 4.22 GHz. If it has 15 cores, then it was likely
> the 4.22 GHz 5 chip/15 core version. Using 10 out of 15 cores (even at
> 100% busy) should fit on 5 z14 ZR1 or z14 M0x IFLs. Sounds like there is
> something causing thrashing. Do you have a z/VM performance product
> (Velocity or IBM?), as that might help isolate where the bottleneck is.

Thanks Jim, this is encouraging. We do have the Performance Toolkit and I've been running sar here as well. When you say thrashing, do you mean memory thrashing?
Re: performance problems db2 after moving from AIX
On Tue, Nov 3, 2020 at 1:35 PM r.stricklin wrote:
> I recently had a vaguely similar problem with a much smaller database on
> linux (x86, mysql for zabbix) that presented bizarre performance issues
> despite clearly having lots of resources left available.
>
> What our problem ended up being was the linux block i/o scheduler deciding
> to defer i/o based on seek avoidance, and under heavy database use this
> was causing havoc with mysql's ability to complete transactions. It seems
> absurd to preferentially avoid seeks when you have SSD. The problems
> vanished instantly when we changed the i/o scheduler on the SSD block
> devices to 'noop'.
>
> I think it's worth checking in your case.

OH! This is a great idea. I had completely forgotten about this. We are using the default deadline scheduler, and noop could save some CPU cycles!! Thank you! I will definitely consider changing the scheduler.
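Checking and switching the elevator is a per-device, on-the-fly change; a sketch (device names are examples; on an FCP/SCSI setup the DB2 LUNs will typically show up as sd*):

```shell
# Current scheduler is shown in brackets, e.g. "noop [deadline] cfq":
cat /sys/block/sda/queue/scheduler
# Switch every SCSI disk to noop (root required); seek-avoiding reordering
# buys nothing on flash:
for d in /sys/block/sd*/queue/scheduler; do echo noop > "$d"; done
```

To persist across reboots on RHEL 7, `elevator=noop` on the kernel command line (or a tuned/udev rule) is the usual route.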
Re: performance problems db2 after moving from AIX
Thanks Christian.
There is no paging (swapping) here besides the kernel's regular housekeeping (vm.swappiness = 5).
RHEL 7 doesn't give me diag_stat in the debug filesystem, hmm.

On Tue, Nov 3, 2020 at 12:37 PM Christian Borntraeger <
borntrae...@linux.ibm.com> wrote:
>
> So you at least had some time where you paged out memory.
> If you have sysstat installed it would be good to get some history data of
> cpu and swap.
>
> You can also run "vmstat 1 -w" to get an online view of the system load.
> Can you also check (as root)
> /sys/kernel/debug/diag_stat
> 2 times and see if you see excessive diagnose 9c rates.
>
> I agree this should have gotten faster, not slower.
>
> If you have an IBM service contract (or any other vendor that provides
> support) you could open a service ticket to get this analysed.
>
> Christian
Re: performance problems db2 after moving from AIX
Gregory,

The 9117-MMD could range from 1 chip/4 cores all the way up to 16 chips/64 cores at either 3.80 or 4.22 GHz. If it has 15 cores, then it was likely the 4.22 GHz 5 chip/15 core version. Using 10 out of 15 cores (even at 100% busy) should fit on 5 z14 ZR1 or z14 M0x IFLs. Sounds like there is something causing thrashing. Do you have a z/VM performance product (Velocity or IBM?), as that might help isolate where the bottleneck is.

Jim Elliott
Senior IT Consultant - GlassHouse Systems Inc.

On Tue, Nov 3, 2020 at 12:25 PM Grzegorz Powiedziuk wrote:
> Hi Jim,
> correction - we have z14, not z114 ... not sure why I keep calling our z14
> a z114 ;)
>
> We have 16 IFLs in total shared across 5 z/VM LPARs, but there is
> literally nothing running in there yet besides this one huge VM, which has
> 10 IFLs configured. We have plenty of spare memory left and this one VM
> has 150G configured. That is about the same as what they had for this
> database when it was on AIX. The number was calculated by the DBAs and
> seems OK. I am not sure how to tell whether DB2 is happy with what it has,
> but the Linux OS is definitely not starving for memory.
>
> That p7 was a 9117-MMD. And I just found that it had EC set to 10 but
> could pull up to 15 processors. I am not sure how that works over there.
Re: performance problems db2 after moving from AIX
On Nov 3, 2020, at 5:46 AM, Grzegorz Powiedziuk wrote:
> DB2 is running on the ext4 filesystem (Actually a huge number of
> filesystems - each NODE is a separate logical volume). Separate for logs,
> data.

I recently had a vaguely similar problem with a much smaller database on linux (x86, mysql for zabbix) that presented bizarre performance issues despite clearly having lots of resources left available.

What our problem ended up being was the linux block i/o scheduler deciding to defer i/o based on seek avoidance, and under heavy database use this was causing havoc with mysql's ability to complete transactions. It seems absurd to preferentially avoid seeks when you have SSD. The problems vanished instantly when we changed the i/o scheduler on the SSD block devices to 'noop'.

I think it's worth checking in your case.

ok
bear.
-- until further notice
Re: performance problems db2 after moving from AIX
Can you provide the output of Q MULTITHREAD and Q SRM?

Regards,
Phil Tully
Chief Architect - z/VM & z/Linux
phil.tu...@adp.com
cell: 973-202-7427
1-800 377-0237,,5252697#

On 11/3/20, 8:47 AM, "Linux on 390 Port on behalf of Grzegorz Powiedziuk" wrote:

> Hi, I could use some ideas. We moved a huge DB2 from an old p7 AIX box to
> RHEL 7 on Z and we are having big performance issues.
> Same memory; CPU count is down from 12 to 10, although they had
> multithreading ON so they saw more "cpus". We have faster disks (moved to
> flash), faster FCP cards and faster network adapters.
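Those CP queries can also be issued from inside the Linux guest, assuming the vmcp module (shipped with s390-tools) is loadable and the guest's privilege class permits the commands:

```shell
# Load the CP interface module, then run the queries Phil asks for:
modprobe vmcp 2>/dev/null
vmcp query srm            # scheduler settings (DSPBUF, LDUBUF, STORBUF, ...)
vmcp query multithread    # whether SMT is active for IFLs at the z/VM level
```

That avoids a trip to a 3270 session when collecting data for the list.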
Re: performance problems db2 after moving from AIX
On 03.11.20 14:46, Grzegorz Powiedziuk wrote:
> Hi, I could use some ideas. We moved a huge DB2 from an old p7 AIX box to
> RHEL 7 on Z and we are having big performance issues.
> Same memory; CPU count is down from 12 to 10, although they had
> multithreading ON so they saw more "cpus". We have faster disks (moved to
> flash), faster FCP cards and faster network adapters.
> We are running on z114 and at this point that is practically the only VM
> running with IFLs on this box.
>
> It seems that when "jobs" run on their own, they finish faster than what
> they were getting on AIX. But problems start when there is more than we
> can chew: either a few jobs running at the same time, or some reorgs
> running in the database.
>
> Load average goes to 150-200, CPUs are at 100% (kernel time can go to
> 20-30%) but no iowaits. Plenty of memory available. At this point
> everything becomes extremely slow, people start having problems connecting
> to DB2 (and sshing); basically it becomes a nightmare.
>
> This DB2 is massive (30+ TB) and it is a multinode configuration (17 nodes
> running on the same host). We moved it like this 1:1 from that old AIX.
>
> DB2 is running on the ext4 filesystem (actually a huge number of
> filesystems - each NODE is a separate logical volume). Separate for logs,
> data.
>
> If this continues like this, we will add 2 CPUs, but I have a feeling that
> it will not make much difference.
>
> I know that we end up with a massive number of processes and a massive
> number of file descriptors (lsof, since it now also shows threads, is
> practically useless - it would run for way too long, 10-30 minutes
> probably).
>
> A snapshot from just now:
>
> top - 08:37:50 up 11 days, 12:04, 28 users, load average: 188.29, 151.07, 133.54
> Tasks: 1843 total, 11 running, 1832 sleeping, 0 stopped, 0 zombie
> %Cpu0 : 76.3 us, 16.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 3.2 si, 2.9 st
> %Cpu1 : 66.1 us, 31.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.3 si, 0.6 st
> %Cpu2 : 66.9 us, 31.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 1.3 si, 0.3 st
> %Cpu3 : 74.7 us, 23.4 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 1.3 si, 0.3 st
> %Cpu4 : 86.7 us, 10.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.3 si, 0.6 st
> %Cpu5 : 83.8 us, 13.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.6 si, 0.3 st
> %Cpu6 : 81.6 us, 15.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.9 si, 0.6 st
> %Cpu7 : 70.6 us, 26.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.9 si, 0.6 st
> %Cpu8 : 70.5 us, 26.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.6 si, 0.6 st
> %Cpu9 : 84.1 us, 13.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 1.3 si, 0.6 st
> KiB Mem : 15424256+total, 1069280 free, 18452168 used, 13472112+buff/cache
> KiB Swap: 52305904 total, 51231216 free, 1074688 used. 17399028 avail Mem

So you at least had some time where you paged out memory.
If you have sysstat installed it would be good to get some history data of cpu and swap.

You can also run "vmstat 1 -w" to get an online view of the system load.
Can you also check (as root)
/sys/kernel/debug/diag_stat
2 times and see if you see excessive diagnose 9c rates.

> Where can I look for potential relief? Everyone was hoping for a better
> performance, not worse. I am hoping that there is something we can tweak
> to make this better.
> I will appreciate any ideas!

I agree this should have gotten faster, not slower.

If you have an IBM service contract (or any other vendor that provides support) you could open a service ticket to get this analysed.

Christian
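Pulling the CPU/swap history Christian asks about is straightforward where sysstat is installed; a sketch, assuming the default /var/log/sa layout (file name = day of month):

```shell
# Replay the daily sysstat file for Nov 3 (sa03); adjust the day as needed:
sar -u -f /var/log/sa/sa03     # CPU: %user/%system/%steal over the day
sar -B -f /var/log/sa/sa03     # paging: pgpgin/s, majflt/s, pgsteal/s
sar -q -f /var/log/sa/sa03     # run-queue length and load averages
# Live wide view during a slowdown:
vmstat -w 1 5                  # r (runnable), swap in/out, us/sy/id/wa/st
```

A run queue of 150-200 with near-zero %idle and climbing %steal would point at CPU contention rather than memory; pgsteal spikes would point back at reclaim.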
Re: performance problems db2 after moving from AIX
Hi Jim, correction - we have a z14, not a z114 ... not sure why I keep calling our z14 a z114 ;)

We have 16 IFLs in total, shared across 5 z/VM LPARs, but there is literally nothing running in there yet besides this one huge VM, which has 10 IFLs configured. We have plenty of spare memory left, and this one VM has 150G configured. That is about the same as what they had for this database when it was on AIX. The number was calculated by the DBAs and seems OK. I am not sure how to tell whether DB2 is happy with what it has, but the Linux OS is definitely not starving for memory.

That P7 was a 9117-MMD. And I just found that it had EC set to 10 but could pull up to 15 processors. I am not sure how that works over there.

On Tue, Nov 3, 2020 at 10:58 AM Jim Elliott wrote:
> Gregory:
>
> Do you have a z114 with 10 IFLs? That is the maximum number of IFLs
> available on a z114 (2818-M10) and would be unusual. Is this a single z/VM
> LPAR? How much memory is on the z114 (and in this LPAR)? Also, what was the
> specific MT/Model for the P7 box?
>
> If you were to compare a 12-core Power 730 (8231-E2C) to a 10 IFL z114, the
> Power system has 1.4 to 2.0 times the capacity of the z114.
>
> Jim Elliott
> Senior IT Consultant - GlassHouse Systems Inc.
>
> [snip: quoted original post]
Re: performance problems db2 after moving from AIX
Gregory:

Do you have a z114 with 10 IFLs? That is the maximum number of IFLs available on a z114 (2818-M10) and would be unusual. Is this a single z/VM LPAR? How much memory is on the z114 (and in this LPAR)? Also, what was the specific MT/Model for the P7 box?

If you were to compare a 12-core Power 730 (8231-E2C) to a 10 IFL z114, the Power system has 1.4 to 2.0 times the capacity of the z114.

Jim Elliott
Senior IT Consultant - GlassHouse Systems Inc.

On Tue, Nov 3, 2020 at 8:47 AM Grzegorz Powiedziuk wrote:
> Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
> Z and we are having big performance issues.
> [snip: rest of original post]
Re: performance problems db2 after moving from AIX
You've got a gig of swap used, and you said %system CPU time is way higher while the system becomes unusable? Are you actively swapping during the time when the system is not responsive? If yes, you need to try to either add memory or reduce the memory demand on the system.

On Tue, Nov 3, 2020 at 8:47 AM Grzegorz Powiedziuk wrote:
> Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
> Z and we are having big performance issues.
> [snip: rest of original post]

--
Jay Brenneman
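Jay's "are you actively swapping" question can be answered with a quick vmstat-based check. A sketch, assuming the standard procps vmstat column layout, where swap-in (si) and swap-out (so) are columns 7 and 8; the function name `swap_check` is made up:

```shell
# Sketch: read vmstat output on stdin and report whether any sample
# shows nonzero swap-in (si, column 7) or swap-out (so, column 8).
# Assumes the standard procps vmstat layout; swap_check is a
# hypothetical helper name.
swap_check() {
    awk 'NR > 2 { if ($7 > 0 || $8 > 0) hit = 1 }
         END { print (hit ? "actively swapping" : "no swap I/O") }'
}

# Usage: sample once a second for a minute while the system is slow:
#   vmstat 1 60 | swap_check
```

Nonzero si/so during the slow periods would confirm the box is paging rather than merely holding a gig of stale pages in swap.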
performance problems db2 after moving from AIX
Hi, I could use some ideas. We moved a huge db2 from an old p7 AIX box to rhel7 on Z, and we are having big performance issues.

Same memory; CPU count is down from 12 to 10, although they had multithreading ON, so they saw more "cpus". We have faster disks (moved to flash), faster FCP cards, and faster network adapters. We are running on z114, and at this point this is practically the only VM running with IFLs on this box.

It seems that when "jobs" run on their own, they finish faster than what they were getting on AIX. But problems start when there is more than we can chew - either a few jobs running at the same time or some reorgs running in the database.

Load average goes to 150-200, cpus are at 100% (kernel time can go to 20-30%), but no iowaits. Plenty of memory available. At this point everything becomes extremely slow, and people start having problems with connecting to db2 (and sshing); basically it becomes a nightmare.

This db2 is massive (30+TB) and it is a multinode configuration (17 nodes running on the same host). We moved it like this 1:1 from that old AIX.

DB2 is running on ext4 filesystems (actually a huge number of filesystems - each NODE is a separate logical volume, with separate ones for logs and data).

If this continues like this, we will add 2 cpus, but I have a feeling that it will not make much difference.

I know that we end up with a massive number of processes and a massive number of file descriptors (lsof, since it now also shows threads, is practically useless - it would run way too long, 10-30 minutes probably).
A snapshot from just now:

top - 08:37:50 up 11 days, 12:04, 28 users, load average: 188.29, 151.07, 133.54
Tasks: 1843 total, 11 running, 1832 sleeping, 0 stopped, 0 zombie
%Cpu0 : 76.3 us, 16.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 3.2 si, 2.9 st
%Cpu1 : 66.1 us, 31.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.3 si, 0.6 st
%Cpu2 : 66.9 us, 31.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 1.3 si, 0.3 st
%Cpu3 : 74.7 us, 23.4 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 1.3 si, 0.3 st
%Cpu4 : 86.7 us, 10.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.3 si, 0.6 st
%Cpu5 : 83.8 us, 13.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.6 si, 0.3 st
%Cpu6 : 81.6 us, 15.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.9 si, 0.6 st
%Cpu7 : 70.6 us, 26.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.9 si, 0.6 st
%Cpu8 : 70.5 us, 26.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.6 hi, 1.6 si, 0.6 st
%Cpu9 : 84.1 us, 13.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 1.3 si, 0.6 st
KiB Mem : 15424256+total, 1069280 free, 18452168 used, 13472112+buff/cache
KiB Swap: 52305904 total, 51231216 free, 1074688 used. 17399028 avail Mem

Where can I look for potential relief? Everyone was hoping for better performance, not worse. I am hoping that there is something we can tweak to make this better. I will appreciate any ideas!

thanks
Gregory
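The steal time visible in the snapshot above (the "st" column, up to ~3% on Cpu0) can also be tracked directly from /proc/stat, which avoids scraping top output. A sketch: steal is the 8th counter on each cpuN line (awk field $9), and `steal_ticks` is a made-up helper name.

```shell
# Sketch: print cumulative steal ticks per CPU from /proc/stat.
# Counters on a cpuN line: user nice system idle iowait irq softirq steal ...
# so steal is awk field $9. steal_ticks is a hypothetical helper name.
steal_ticks() {
    awk '/^cpu[0-9]/ { print $1 " steal=" $9 }' "${1:-/proc/stat}"
}

# Sample twice a few seconds apart; a fast-growing difference means the
# hypervisor is taking CPU away from the guest:
#   steal_ticks; sleep 5; steal_ticks
```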