Re: performance problems db2 after moving from AIX

2020-11-03 Thread Eric Covener
> Where can I look for potential relief? Everyone was hoping for better
> performance, not worse. I am hoping that there is something we can tweak to
> make this better.

Only because I didn't see it specifically in the thread yet, do you have
similar large page size support/tuning in both environments?
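
For reference, a quick way to compare the two sides; a sketch, assuming root on the Linux guest, where the instance name (db2inst1) is a placeholder and you should verify that your DB2 level supports the DB2_LARGE_PAGE_MEM registry variable:

    # is a large-page pool configured and in use on the Linux side?
    grep -i hugepages /proc/meminfo
    # does the DB2 instance actually request large pages?
    su - db2inst1 -c "db2set -all" | grep -i LARGE_PAGE

On the AIX side the equivalent tuning would typically have been done with vmo (16M pages), so it is worth checking whether the old box had it and the new one does not.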



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Tully, Phil (CORP)
Can you provide the z/VM SRM parameters and MT status?

Regards, Phil Tully
Chief Architect-z/VM & z/Linux
phil.tu...@adp.com
cell: 973-202-7427
1-800 377-0237,,5252697#
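
For reference, both can be pulled from inside the guest with the vmcp tool from s390-tools, given sufficient CP privilege class; a minimal sketch:

    modprobe vmcp            # only needed if the vmcp device is not there yet
    vmcp q srm               # scheduler / SRM settings
    vmcp q multithread       # SMT status of the IFLs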



On 11/3/20, 12:25 PM, "Linux on 390 Port on behalf of Grzegorz Powiedziuk" 
 wrote:

> Hi Jim,
> correction - we have z14 not z114
> [...]

Re: performance problems db2 after moving from AIX

2020-11-03 Thread Rob van der Heij
On Tue, 3 Nov 2020 at 21:27, Grzegorz Powiedziuk 
wrote:

>
> In the performance monitor toolkit it shows around 12,000 diag x'9c'/s
> and 50 x'44'
> But at this time of day everything is calm. I will check again tomorrow.
> Lots of diag x'9c' would indicate too many virtual CPUs, right?
>

You would expect the amount of spinning to go down when you vary some CPUs
off from the Linux side. Since you seem to be running just this single
massive guest, there's less concern about adjusting SHARE settings along
with it.
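
For reference, this can be tried on the fly from inside Linux; a sketch, where the CPU numbers are placeholders:

    lscpu -e                                       # list online CPUs
    chcpu -d 8,9                                   # take CPUs 8 and 9 offline
    echo 0 > /sys/devices/system/cpu/cpu9/online   # equivalent low-level form

If the diag x'9c' rate drops noticeably with fewer virtual CPUs online, that points at over-provisioned virtual CPUs.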

It's not entirely clear from your description how identical this is to the
AIX configuration. If you doubled the number of virtual CPUs for the guest
because of SMT-2, with that many virtual CPUs the application locking may
turn out to make that counter-productive. There's no doubt a lot of wisdom
in monitor data from the workload to explain the z/VM side. As Christian
suggests, you would be wise to take advantage of your support arrangement
and have a z/VM performance analyst look into it.

Rob



Re: performance problems db2 after moving from AIX

2020-11-03 Thread barton

Diag 9C is low cost; Diag 44, not so much. 50 is a low number.


On 11/3/2020 12:27 PM, Grzegorz Powiedziuk wrote:

> In the performance monitor toolkit it shows around 12,000 diag x'9c'/s
> and 50 x'44'
> But at this time of day everything is calm. I will check again tomorrow.
> Lots of diag x'9c' would indicate too many virtual CPUs, right?



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
On Tue, Nov 3, 2020 at 3:25 PM Jim Elliott  wrote:

> Gregory,
>
> Yes, thrashing. :-)
>
>
I like my name better and it even fits better :)
I will keep an eye on page faults tomorrow, but we are not overcommitting
memory at all. Unless something inside of db2 is cooking - but in linux there
is no swapping, and in z/VM no paging whatsoever.
DB2 grabs about 95% of the memory, and it all goes into "shmem" and uses
that for its buffers and stuff. It's not like db2 has its own internal
paging? Even if it did, I am sure the DBAs would be screaming by now.
Although the ~20% cpu time spent in kernel mode is something I've been
questioning (the rest of it goes straight into user time). But I have no clue
if it is a lot for this type of workload or not at all. I've been blaming the
massive number of filesystems/logical volumes (152) and the huge number of
threads, processes, and FDs.
How do you best determine that there is no other thrashing?
thanks!

Gregory
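
For reference, two cheap checks that avoid lsof entirely; a sketch:

    grep -i '^Shmem' /proc/meminfo   # shared memory counters (where the DB2 buffers land)
    ipcs -m                          # SysV shared memory segments, if DB2 uses them
    cat /proc/sys/fs/file-nr         # allocated / unused / max file handles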


> Jim Elliott
> Senior IT Consultant - GlassHouse Systems Inc.
>
>



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
On Tue, Nov 3, 2020 at 1:58 PM Grzegorz Powiedziuk 
wrote:

> Thanks Christian.
> There is no paging (swapping) here besides the kernel's regular
> housekeeping (vm.swappiness=5)
> rhel 7 doesn't give me diag_stat in the debug filesystem hmm
>
> On Tue, Nov 3, 2020 at 12:37 PM Christian Borntraeger <
> borntrae...@linux.ibm.com> wrote:
>
>>
>> So you at least had some time where you paged out memory.
>> If you have sysstat installed, it would be good to get some history data
>> of CPU and swap.
>>
>> You can also run "vmstat 1 -w" to get an online view of the system load.
>> Can you also check (as root)
>> /sys/kernel/debug/diag_stat
>> twice and see if you see excessive diagnose 9c rates.
>>
>>
In the performance monitor toolkit it shows around 12,000 diag x'9c'/s
and 50 x'44'
But at this time of day everything is calm. I will check again tomorrow.
Lots of diag x'9c' would indicate too many virtual CPUs, right?



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Jim Elliott
Gregory,

Yes, thrashing. :-)

Jim Elliott
Senior IT Consultant - GlassHouse Systems Inc.


On Tue, Nov 3, 2020 at 2:07 PM Grzegorz Powiedziuk 
wrote:

> Thanks Jim, this is encouraging. We do have a performance monitor toolkit
> and I've been running sar in here as well.
> When you say trashing, do you mean memory thrashing?



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
On Tue, Nov 3, 2020 at 1:57 PM Jim Elliott  wrote:

> Gregory,
>
> The 9117-MMD could range from 1 chip/4 cores all the way up to 16 chips/64
> cores at either 3.80 or 4.22 GHz. If it has 15 cores, then it was likely
> the 4.22 GHz 5 chip/15 core version. Using 10 out of 15 cores (even at 100%
> busy) should fit on 5 z14 ZR1 or z14 M0x IFLs. Sounds like there is
> something causing thrashing. Do you have a z/VM performance product
> (Velocity or IBM?), as that might help isolate where the bottleneck is.

Thanks Jim, this is encouraging. We do have a performance monitor toolkit
and I've been running sar in here as well.
When you say trashing, do you mean memory thrashing?



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
On Tue, Nov 3, 2020 at 1:35 PM r.stricklin  wrote:

>
> I recently had a vaguely similar problem with a much smaller database on
> linux (x86, mysql for zabbix) that presented bizarre performance issues
> despite clearly having lots of resources left available.
>
> What our problem ended up being was the linux block i/o scheduler deciding
> to defer i/o based on seek avoidance, and under heavy database use this was
> causing havoc with mysql's ability to complete transactions. It seems
> absurd to preferentially avoid seeks when you have SSD. The problems
> vanished instantly when we changed the i/o scheduler on the SSD block
> devices to 'noop'.
>
> I think it's worth checking in your case.
>
>
OH! this is a great idea. I'd completely forgotten about this. We are using
the default deadline scheduler, and noop could save some CPU cycles!!
thank you! I will definitely consider changing the scheduler.
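
For reference, the scheduler can be inspected and switched per block device at runtime; a sketch, where sda is a placeholder (with multipath FCP you would repeat this for each path device, or set it with a udev rule):

    cat /sys/block/sda/queue/scheduler           # e.g.: noop [deadline] cfq
    echo noop > /sys/block/sda/queue/scheduler   # switch to noop for this device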



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
Thanks Christian.
There is no paging (swapping) here besides the kernel's regular
housekeeping (vm.swappiness=5)
rhel 7 doesn't give me diag_stat in the debug filesystem hmm
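
For reference, worth ruling out the trivial case first; a sketch:

    mount | grep debugfs                          # is debugfs mounted at all?
    mount -t debugfs debugfs /sys/kernel/debug    # if not, mount it
    ls /sys/kernel/debug | grep -i diag           # present only if the kernel exports it

If the file still is not there, the running kernel most likely predates the diag_stat interface.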

On Tue, Nov 3, 2020 at 12:37 PM Christian Borntraeger <
borntrae...@linux.ibm.com> wrote:

> On 03.11.20 14:46, Grzegorz Powiedziuk wrote:
> > Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7
> > on Z and we are having big performance issues.
> > [...]
> > KiB Mem : 15424256+total,  1069280 free, 18452168 used, 13472112+buff/cache
> > KiB Swap: 52305904 total, 51231216 free,  1074688 used. 17399028 avail Mem
>
> So you at least had some time where you paged out memory.
> If you have sysstat installed, it would be good to get some history data
> of CPU and swap.
>
> You can also run "vmstat 1 -w" to get an online view of the system load.
> Can you also check (as root)
> /sys/kernel/debug/diag_stat
> twice and see if you see excessive diagnose 9c rates.
>
> >
> > Where can I look for potential relief? Everyone was hoping for better
> > performance, not worse. I am hoping that there is something we can tweak
> > to make this better.
> > I will appreciate any ideas!
>
> I agree this should have gotten faster, not slower.
>
> If you have an IBM service contract (or any other vendor that provides
> support)
> you could open a service ticket to get this analysed.
>
> Christian
>



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Jim Elliott
Gregory,

The 9117-MMD could range from 1 chip/4 cores all the way up to 16 chips/64
cores at either 3.80 or 4.22 GHz. If it has 15 cores, then it was likely
the 4.22 GHz 5 chip/15 core version. Using 10 out of 15 cores (even at 100%
busy) should fit on 5 z14 ZR1 or z14 M0x IFLs. Sounds like there is
something causing thrashing. Do you have a z/VM performance product
(Velocity or IBM?), as that might help isolate where the bottleneck is.

Jim Elliott
Senior IT Consultant - GlassHouse Systems Inc.


On Tue, Nov 3, 2020 at 12:25 PM Grzegorz Powiedziuk 
wrote:

> Hi Jim,
> correction - we have z14 not z114
> [...]

Re: performance problems db2 after moving from AIX

2020-11-03 Thread r.stricklin
On Nov 3, 2020, at 5:46 AM, Grzegorz Powiedziuk wrote:

> DB2 is running on the ext4 filesystem (Actually a huge number of
> filesystems- each NODE is a separate logical volume). Separate for logs,
> data.

I recently had a vaguely similar problem with a much smaller database on linux 
(x86, mysql for zabbix) that presented bizarre performance issues despite 
clearly having lots of resources left available.

What our problem ended up being was the linux block i/o scheduler deciding to 
defer i/o based on seek avoidance, and under heavy database use this was 
causing havoc with mysql's ability to complete transactions. It seems absurd to 
preferentially avoid seeks when you have SSD. The problems vanished instantly 
when we changed the i/o scheduler on the SSD block devices to 'noop'.

I think it's worth checking in your case.


ok
bear.

-- 
until further notice



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Tully, Phil (CORP)
Can you provide the output of
Q MULTITHREAD and Q SRM?

Regards, Phil Tully
Chief Architect-z/VM & z/Linux
phil.tu...@adp.com
cell: 973-202-7427
1-800 377-0237,,5252697#



On 11/3/20, 8:47 AM, "Linux on 390 Port on behalf of Grzegorz Powiedziuk" 
 wrote:

> Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
> Z and we are having big performance issues.
> [...]

Re: performance problems db2 after moving from AIX

2020-11-03 Thread Christian Borntraeger
On 03.11.20 14:46, Grzegorz Powiedziuk wrote:
> Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
> Z and we are having big performance issues.
> Same memory; CPU count is down from 12 to 10, although they had
> multithreading ON so they saw more "cpus". We have faster disks (moved to
> flash), faster FCP cards, and faster network adapters.
> We are running on z114 and at this point that is practically the only VM
> running with IFLs on this box.
>
> It seems that when "jobs" run on their own, they finish faster than what
> they were getting on AIX.
> But problems start when there is more than we can chew: either a few jobs
> running at the same time, or some reorgs running in the database.
>
> Load average goes to 150-200, cpus are at 100%  (kernel time can go to
> 20-30% ) but no iowaits.
> Plenty of memory available.
> At this point everything becomes extremely slow, people start having
> problems connecting to db2 (and sshing); basically it becomes a
> nightmare.
>
> This db2 is massive (30+TB) and it is a multinode configuration (17 nodes
> running on the same host). We moved it like this 1:1 from that old AIX.
>
> DB2 is running on ext4 filesystems (actually a huge number of
> filesystems - each NODE is a separate logical volume, with separate ones
> for logs and data).
>
> If this continues like this, we will add 2 CPUs, but I have a feeling that
> it will not make much difference.
>
> I know that we end up with a massive number of processes and a massive
> number of file descriptors (lsof, since it now also shows threads, is
> practically useless - it would run for way too long, probably 10-30
> minutes).
>
> A snapshot from just now:
>
> top - 08:37:50 up 11 days, 12:04, 28 users,  load average: 188.29, 151.07,
> 133.54
> Tasks: 1843 total,  11 running, 1832 sleeping,   0 stopped,   0 zombie
> [...]
> KiB Mem : 15424256+total,  1069280 free, 18452168 used, 13472112+buff/cache
> KiB Swap: 52305904 total, 51231216 free,  1074688 used. 17399028 avail Mem

So you at least had some time where you paged out memory.
If you have sysstat installed, it would be good to get some history data of
CPU and swap.

You can also run "vmstat 1 -w" to get an online view of the system load.
Can you also check (as root)
/sys/kernel/debug/diag_stat
twice and see if you see excessive diagnose 9c rates.
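
For reference, a minimal sketch of both checks; the sa file name and the 10-second interval are arbitrary:

    sar -u -f /var/log/sa/sa03      # CPU history for the 3rd, if sysstat is logging
    sar -W -f /var/log/sa/sa03      # pages swapped in/out per second
    # diag_stat counters are cumulative, so take the difference of two reads:
    cat /sys/kernel/debug/diag_stat; sleep 10; cat /sys/kernel/debug/diag_stat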

>
> Where can I look for potential relief? Everyone was hoping for better
> performance, not worse. I am hoping that there is something we can tweak to
> make this better.
> I will appreciate any ideas!

I agree this should have gotten faster, not slower.

If you have an IBM service contract (or any other vendor that provides support)
you could open a service ticket to get this analysed.

Christian



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
Hi Jim,
correction - we have a z14, not a z114
... not sure why I keep calling our z14 a z114 ;) We have a z14.

We have 16 IFLs in total shared across 5 z/VM LPARs, but really there is
literally nothing running in there yet besides this one huge VM, which has 10
IFLs configured. We have plenty of spare memory left, and this one VM has
150G configured. It is about the same as what they had for this database
when it was on AIX. This number has been calculated by the DBAs and it
seems ok. I am not sure how to tell if DB2 is happy with what it has or
not, but the linux OS is definitely not starving for memory.

That p7 was a 9117-MMD. And I just found that it had EC set to 10 but it
could pull up to 15 processors. I am not sure how that works over there.



On Tue, Nov 3, 2020 at 10:58 AM Jim Elliott  wrote:

> Gregory:
>
> Do you have a z114 with 10 IFLs? That is the maximum number of IFLs
> available on a z114 (2818-M10) and would be unusual. Is this a single z/VM
> LPAR? How much memory is on the z114 (and in this LPAR)? Also, what was the
> specific MT/Model for the P7 box?
>
> If you were to compare a 12-core Power 730 (8231-E2C) to a 10 IFL z114 the
> Power system has 1.4 to 2.0 times the capacity of the z114.
>
> Jim Elliott
> Senior IT Consultant - GlassHouse Systems Inc.
>
>
> On Tue, Nov 3, 2020 at 8:47 AM Grzegorz Powiedziuk wrote:
>
> > Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7
> > on Z and we are having big performance issues.
> > [...]

Re: performance problems db2 after moving from AIX

2020-11-03 Thread Jim Elliott
Gregory:

Do you have a z114 with 10 IFLs? That is the maximum number of IFLs
available on a z114 (2818-M10) and would be unusual. Is this a single z/VM
LPAR? How much memory is on the z114 (and in this LPAR)? Also, what was the
specific MT/Model for the P7 box?

If you were to compare a 12-core Power 730 (8231-E2C) to a 10 IFL z114 the
Power system has 1.4 to 2.0 times the capacity of the z114.

Jim Elliott
Senior IT Consultant - GlassHouse Systems Inc.


On Tue, Nov 3, 2020 at 8:47 AM Grzegorz Powiedziuk 
wrote:

> Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
> Z and we are having big performance issues.
> [...]



Re: performance problems db2 after moving from AIX

2020-11-03 Thread Robert J Brenneman
You've got a gig of swap used, and you said %system CPU time is way higher
while the system becomes unusable?
Are you actively swapping during that time when the system is not
responsive? If yes, you need to either add memory or reduce the
memory demand on the system.
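
For reference, the live check is simple; a sketch:

    vmstat 5     # sustained non-zero si/so columns during a slowdown
                 # mean the guest really is swapping, not just holding pages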

On Tue, Nov 3, 2020 at 8:47 AM Grzegorz Powiedziuk 
wrote:

> Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
> Z and we are having big performance issues.
> [...]
> KiB Mem : 15424256+total,  1069280 free, 18452168 used, 13472112+buff/cache
> KiB Swap: 52305904 total, 51231216 free,  1074688 used. 17399028 avail Mem


--
Jay Brenneman



performance problems db2 after moving from AIX

2020-11-03 Thread Grzegorz Powiedziuk
Hi, I could use some ideas. We moved a huge db2 from old p7 aix to rhel7 on
Z and we are having big performance issues.
Same memory; CPU count is down from 12 to 10, although they had
multithreading ON so they saw more "cpus". We have faster disks (moved to
flash), faster FCP cards, and faster network adapters.
We are running on z114 and at this point that is practically the only VM
running with IFLs on this box.

It seems that when "jobs" run on their own, they finish faster than what
they were getting on AIX.
But problems start when there is more than we can chew: either a few jobs
running at the same time, or some reorgs running in the database.

Load average goes to 150-200, cpus are at 100%  (kernel time can go to
20-30% ) but no iowaits.
Plenty of memory available.
At this point everything becomes extremely slow, people start having
problems connecting to db2 (and sshing); basically it becomes a
nightmare.

This db2 is massive (30+TB) and it is a multinode configuration (17 nodes
running on the same host). We moved it like this 1:1 from that old AIX.

DB2 is running on ext4 filesystems (actually a huge number of
filesystems - each NODE is a separate logical volume, with separate ones
for logs and data).

If this continues like this, we will add 2 CPUs, but I have a feeling that
it will not make much difference.

I know that we end up with a massive number of processes and a massive
number of file descriptors (lsof, since it now also shows threads, is
practically useless - it would run for way too long, probably 10-30
minutes).

A snapshot from just now:

top - 08:37:50 up 11 days, 12:04, 28 users,  load average: 188.29, 151.07,
133.54
Tasks: 1843 total,  11 running, 1832 sleeping,   0 stopped,   0 zombie
%Cpu0  : 76.3 us, 16.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  1.0 hi,  3.2 si,
 2.9 st
%Cpu1  : 66.1 us, 31.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.6 hi,  1.3 si,
 0.6 st
%Cpu2  : 66.9 us, 31.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  1.3 si,
 0.3 st
%Cpu3  : 74.7 us, 23.4 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  1.3 si,
 0.3 st
%Cpu4  : 86.7 us, 10.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.6 hi,  1.3 si,
 0.6 st
%Cpu5  : 83.8 us, 13.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.6 hi,  1.6 si,
 0.3 st
%Cpu6  : 81.6 us, 15.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.6 hi,  1.9 si,
 0.6 st
%Cpu7  : 70.6 us, 26.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.6 hi,  1.9 si,
 0.6 st
%Cpu8  : 70.5 us, 26.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.6 hi,  1.6 si,
 0.6 st
%Cpu9  : 84.1 us, 13.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  1.3 si,
 0.6 st
KiB Mem : 15424256+total,  1069280 free, 18452168 used, 13472112+buff/cache
KiB Swap: 52305904 total, 51231216 free,  1074688 used. 17399028 avail Mem

Where can I look for potential relief? Everyone was hoping for better
performance, not worse. I am hoping that there is something we can tweak to
make this better.
I will appreciate any ideas!
thanks
Gregory

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www2.marist.edu/htbin/wlvindex?LINUX-390