Re: CP overhead of using FCP attached SCSI SAN

2020-11-05 Thread Rob van der Heij
On Thu, 5 Nov 2020 at 16:23, Dave Jones  wrote:

> So handling this approximately 1K times could drive the %steal up, I
> think.
>

That appears to be nmon failing, probably picking up a stale pointer or
following some bunny trail. It's something to pick up with Nigel. It does
not directly contribute to CP overhead, but the recovery may well consume
real CPU time, so the guest has to compete harder for CPU to get the real
work done and runs into more contention once it goes beyond its allocated
capacity.

Rob

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www2.marist.edu/htbin/wlvindex?LINUX-390


Re: CP overhead of using FCP attached SCSI SAN

2020-11-05 Thread Dave Jones

No offense taken, Rob! :-)
We now have another clue, this time from the console log. This error
keeps recurring:

[82053.035436] User process fault: interruption code 0x60010 in nmon_mainframe_6
[82053.035448] failing address: 3FFFD379000
[82053.035453] CPU: 24 PID: 17149 Comm: nmon_mainframe_ Kdump: loaded Not tainte
[82053.035456] task: 0041402d8000 ti: 0042b55d8000 task.ti: 0042b55d
[82053.035459] User PSW : 070520018000 80007114 (0x80007114)
[82053.035477] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 EA:
User GPRS: 441c 0073e750 00670680 00077000
[82053.035484] afe8da70 afe89b90 80067a50 000
[82053.035488] 01f9 03fffcbc1440 00035430 000
[82053.035492] 01fc 800282b0 0672 000
[82053.035505] User Code: 80007106: eb210003000d sllg %r2,%r1,
                          8000710c: b9080012 agr %r1,%r2
                         #80007110: 41717000 la %r7,0(%r1,%r7)
                         >80007114: e31071a4 lg %r1,416(%r7)
                          8000711a: ec1300442065 clgrj %r1,%r3,2,800071a2
                          80007120: b9e91013 sgrk %r1,%r3,%r1
                          80007124: e32071a80004 lg %r2,424(%r7)
                          8000712a: e33091a80004 lg %r3,424(%r9)
[82053.035558] Last Breaking-Event-Address:
[82053.035560] <8000707e> 0x8000707e

So handling this approximately 1K times could drive the %steal up, I
think.
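
To check how often the fault really recurs, something along these lines could count the "User process fault" entries in a saved console log. This is only a rough sketch: the regular expression assumes the message layout shown in the log above (other kernel levels word this line differently), and the function name count_user_faults is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [82053.035436] User process fault: interruption code 0x60010 in nmon_mainframe_6
# The layout is an assumption taken from the log excerpt in this thread.
FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+User process fault: "
    r"interruption code (?P<code>0x[0-9a-fA-F]+) in (?P<image>\S+)"
)


def count_user_faults(log_text):
    """Count user process faults per (program, interruption code) pair."""
    counts = Counter()
    for match in FAULT_RE.finditer(log_text):
        counts[(match.group("image"), match.group("code"))] += 1
    return counts
```

Feeding it the output of dmesg (or the saved console log) gives one counter per failing program, which shows whether the "approximately 1K times" estimate holds.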
Thanks again.
DJ
---
DAVID JONES | MANAGING DIRECTOR FOR ZSYSTEMS SERVICES | z/VM, Linux, and
Cloud
703.237.7370 (Office) | 281.578.7544 (CELL)

INFORMATION TECHNOLOGY COMPANY

On 11.05.2020 12:48 AM, Rob van der Heij wrote:

> On Wed, 4 Nov 2020 at 20:06, Rob van der Heij wrote:
>
>> On Wed, 4 Nov 2020 at 19:30, Dave Jones wrote:
>>
>>> What is the CP overhead of managing this?  The one Linux guest that is
>>> running here reports a %steal of 15-17%, which I think is a bit high.
>
> At the risk of teaching granny... the most common cause for reported high
> "steal percentage" in Linux is not CP overhead on I/O, but contention for
> CPU resources on z/VM and PR/SM level.
> And when you see high %steal combined with %idle in the guest, then it may
> well be due to an application causing a high amount of polling (and
> lowering the number of virtual CPUs closer to the allocated capacity may be
> helpful).
>
> Rob



Re: CP overhead of using FCP attached SCSI SAN

2020-11-05 Thread Rob van der Heij
On Wed, 4 Nov 2020 at 20:06, Rob van der Heij wrote:

> On Wed, 4 Nov 2020 at 19:30, Dave Jones wrote:
>
>> What is the CP overhead of managing this?  The one Linux guest that is
>> running here reports a %steal of 15-17%, which I think is a bit high.

At the risk of teaching granny... the most common cause for reported high
"steal percentage" in Linux is not CP overhead on I/O, but contention for
CPU resources on z/VM and PR/SM level.
And when you see high %steal combined with %idle in the guest, then it may
well be due to an application causing a high amount of polling (and
lowering the number of virtual CPUs closer to the allocated capacity may be
helpful).
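
For reference, the %steal and %idle figures that Linux tools report come from the tick counters on the aggregate "cpu" line of /proc/stat. A minimal sketch of that arithmetic follows, assuming the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration and this is not how any particular monitor is implemented.

```python
import time

# First eight counters on the aggregate "cpu" line, in proc(5) order.
# The trailing guest/guest_nice counters are already included in
# user/nice, so they are deliberately left out of the total.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    if len(values) < len(FIELDS):
        raise ValueError("kernel does not report a steal counter")
    return dict(zip(FIELDS, values))


def percentages(before, after):
    """Share of each CPU state between two samples, in percent."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_cpu(interval=5.0):
    """Read /proc/stat twice, 'interval' seconds apart, and return percentages."""
    with open("/proc/stat") as stat:
        first = parse_cpu_line(stat.readline())
    time.sleep(interval)
    with open("/proc/stat") as stat:
        second = parse_cpu_line(stat.readline())
    return percentages(first, second)
```

Calling sample_cpu() in the guest and comparing the "steal" and "idle" entries shows whether the two are high at the same time, which is the pattern described above.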

Rob



Re: CP overhead of using FCP attached SCSI SAN

2020-11-04 Thread Rob van der Heij
On Wed, 4 Nov 2020 at 19:30, Dave Jones wrote:

> What is the CP overhead of managing this?  The one Linux guest that is
> running here reports a %steal of 15-17%, which I think is a bit high.
> Could this be configured better?
> Thanks, appreciate it.


You’re right that EDEV overhead would show in the guest’s TTIME and thus
be noticed by Linux as steal. Your configuration with application data on
FCP and just the OS on EDEV is good practice.

I would not expect OS disk I/O to be high enough for you to notice, unless
you’re very tight on memory and drop libraries and executables from the page
cache as soon as the program terminates. You’d see that with stateless
agents that fire up a program every 10 seconds to report some data.

My rules of thumb are too dated, but you should check with vmstat -d or
other tools whether there really is enough I/O to worry about. Then look at
what is being read or written (application logs could be an issue and can be
moved to FCP).
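
As a rough illustration of that check, the per-device counters that vmstat -d summarizes are also available in /proc/diskstats. The sketch below turns two snapshots into I/O rates per device. It assumes the standard /proc/diskstats layout (reads completed in column 4, sectors read in column 6, writes completed in column 8, sectors written in column 10, sectors counted in 512-byte units); the function names, and the device names dasda and sda in any example data, are hypothetical.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> I/O counters from /proc/diskstats-formatted text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        stats[fields[2]] = {
            "reads": int(fields[3]),
            "sectors_read": int(fields[5]),
            "writes": int(fields[7]),
            "sectors_written": int(fields[9]),
        }
    return stats


def io_rates(before, after, seconds):
    """Per-device I/O operations and KiB transferred per second between snapshots."""
    rates = {}
    for dev, now in after.items():
        prev = before.get(dev)
        if prev is None:
            continue  # device appeared between the two snapshots
        ops = (now["reads"] - prev["reads"]) + (now["writes"] - prev["writes"])
        sectors = (now["sectors_read"] - prev["sectors_read"]) + (
            now["sectors_written"] - prev["sectors_written"]
        )
        rates[dev] = {
            "iops": ops / seconds,
            "kib_per_s": sectors * SECTOR_BYTES / 1024 / seconds,
        }
    return rates
```

Taking two snapshots of /proc/diskstats a known number of seconds apart and comparing the rates for the OS disks with those for the FCP-attached disks shows whether the OS disks carry enough I/O to matter.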

Rob



Re: CP overhead of using FCP attached SCSI SAN

2020-11-04 Thread barton
I don't see how steal time and SCSI would be related. There's an old
article about what steal time is at http://velocitysoftware.com/STEAL.html



On 11/4/2020 10:30 AM, Dave Jones wrote:

Hello, all.

I have a site that is using SCSI SAN (a V7000) to hold all z/VM
storage: the system z/VM and Linux software is installed on emulated
FBA DASD, and the Oracle database is stored on 360 or so SCSI disks,
attached via FCP.

What is the CP overhead of managing this?  The one Linux guest that is
running here reports a %steal of 15-17%, which I think is a bit high.
Could this be configured better?
Thanks, appreciate it.
DJ

--
DAVID JONES | MANAGING DIRECTOR FOR ZSYSTEMS SERVICES | z/VM, Linux, and
Cloud
703.237.7370 (Office) | 281.578.7544 (CELL)

INFORMATION TECHNOLOGY COMPANY
