AW: AW: Problems with SCSI FBA device under z/VM

2011-05-05 Thread Heiko Ruck
Thanks for your answer. Linux was actually shut down when we tried to vary 
the EDEVice online.

What options do we have for tracing an EDEVice in z/VM?

We ran a SCSIDISC debug and saw the following:

INFO::FCP SUB-CHANNEL 2000 Re-Initialized
INFO::WWPN 21D02317 Opened
INFO::WWPN 21D02317 UTIL LUN Opened
INFO::WWPN 21D02317 UTIL LUN Closed
INFO::For WWPN 21D02317 No of LUNs found=8
INFO::For 2000  21D02317 Choosen LUN=
DEBUG::LUN  for WWPN 21D02317 Opened
DEBUG::LUN  Closed
INFO::WWPN 21D02317 Closed

We will also open a service request with IBM.

Regards,
Heiko Ruck

From: The IBM z/VM Operating System [mailto:IBMVM@LISTSERV.UARK.EDU] On Behalf 
Of Eric R Farman
Sent: Wednesday, May 4, 2011 17:28
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: AW: Problems with SCSI FBA device under z/VM

Have you removed the configuration of the LUN from Linux before you varied the 
EDEVice online?  z/VM will not be able to open a LUN R/W if it is already open 
on another FCP CHPID/WWPN pair.  It doesn't matter whether it's a different 
subchannel if it's on the same CHPID. 

If that's not it, I would suggest opening a service request so that 
we can collect some traces and determine why the LUN can't be opened.  Non-IBM 
storage should work, but you should only get the UDID Mismatch message with 
multipath devices, not single-path ones. 

Regards,
   Eric

Eric Farman
z/VM I/O Development
IBM Endicott, NY




From: Heiko Ruck hr...@mainstorconcept.de
To: IBMVM@LISTSERV.UARK.EDU
Date: 05/04/2011 03:55 PM
Subject: AW: Problems with SCSI FBA device under z/VM
Sent by: The IBM z/VM Operating System IBMVM@LISTSERV.UARK.EDU

We used another subchannel and another LUN for z/Linux: device 2000 was used 
for z/VM (edev 9000) with LUN1, and device 2002 was used for z/Linux with LUN0.
We tried it with and without NPIV; it works in both modes for Linux but not for 
z/VM.
The PPP in the WWPN can be ignored (I changed the WWPN after copy and paste, 
sorry for the confusion).

-----Original Message-----
From: The IBM z/VM Operating System [mailto:IBMVM@LISTSERV.UARK.EDU] On Behalf 
Of Richard Troth
Sent: Wednesday, May 4, 2011 15:29
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Problems with SCSI FBA device under z/VM

So ... when it works on Linux, how is it defined?  Using the same
subchannel?  (2000)   Is NPIV in place?  What's with the PPP in the
WWPN in the error report?  Is Linux also using this device when you try to use 
it as an EDEV?


-- R;   
Rick Troth
Velocity Software
http://www.velocitysoftware.com/

On Wed, May 4, 2011 at 08:58, Heiko Ruck hr...@mainstorconcept.de wrote:
 Hi all,

 we are having problems using our SCSI storage system, defined as FCP, 
 under z/VM. The same storage system works fine for our z/Linux running as 
 a guest of this VM, so we assume that there are no errors in the 
 definitions in the IOCP, HMC, Fibre Channel switch, or SCSI storage system. 
 We are currently running z/VM 5.4 at service level 1101, and we can see 
 the WWPN and LUNs of the storage when using SCSIDISC. The SET EDEV 
 command also works, but as soon as we vary the device online we receive 
 the following error:

 HCPSZP8701I Path FCP_DEV 2000 WWPN 2200PPP023100017 LUN 
 0001 was deleted from EDEV 9000 because it is invalid.
 HCPCPN8700I Emulated Device 9000 cannot be varied online because
 HCPCPN8700I there are no valid paths defined to the device.

 There are no errors on the operator console. The same test on 
 z/VM 5.3 ends with:

 11:51:01 HCPAXS3586E UDID mismatch for path 6000PATH01.
  Function:scsi_mpio_init
 11:51:01 HCPCSS3507E Thread (0x01B9C148) encountered a severe 
 error in pers_paix_loadDevice at line 1604: EID(3)
 RC(0x0015) RSN(0x0033)





 Does anyone have an idea why we receive an error in z/VM although it 
 works under z/Linux?

 I should mention that this is not an IBM storage system. Does 
 anyone have a list of storage systems that work with z/VM, or 
 experience with a system that works? I think only the IBM 2105, 2107, 
 1750, 2145, and 2810 are supported, but which other systems also 
 work, even if they are unsupported and/or from another vendor?

 Thanks,
 Heiko Ruck

 Checked by MSC FGT60



Re: AW: AW: Problems with SCSI FBA device under z/VM

2011-05-05 Thread Eric R Farman
In your original note, you were asking to open LUN x0001 as 
part of an EDEV.  But in your SCSIDISC DEBUG, you are opening LUN 
x.  Can you try the same debug against that LUN?

Regards,
Eric

Eric Farman
z/VM I/O Development
IBM Endicott, NY
(607)429-4958 (tie 620)




Re: Problem with z/Linux guest Ethernet frames (buffering ?)

2011-05-05 Thread Bhemidhi, Ashwin
Hello Scott,

We are using a Layer 2 VSWITCH for the guest, and the MTU size is set to 1492. 
None of the Ethernet frames are being dropped, but they are being delayed.

Regards,
Ashwin


From: The IBM z/VM Operating System [mailto:IBMVM@LISTSERV.UARK.EDU] On Behalf 
Of Scott Rohling
Sent: Wednesday, May 04, 2011 8:12 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Problem with z/Linux guest Ethernet frames (buffering ?)

Just a guess, but have you checked MTU sizes?   Are you using a VSWITCH for the 
guests or dedicated OSA?

Scott Rohling
On Wed, May 4, 2011 at 5:15 PM, Bhemidhi, Ashwin ashw...@ti.com wrote:
Hello all,

Recently we started noticing on a few of our Linux guests that Ethernet frames 
were being delayed by up to 25 seconds between the time they were sent and the 
time the guest received them.  The frames are Ethernet LLC keep-alive polls 
(layer 2 polls) that a Cisco SNA switch router sends every 30 seconds.  Both the 
router and the Linux guest are on the same LAN.

Looking at the Ethereal traces captured on the guest: during normal operation 
the keep-alive frames are sent every 30 seconds and the z/Linux guest 
responds to the poll within 60 microseconds.  But a few times we noticed that 
the frames were delayed by up to 25 seconds (total time from the previous 
poll is 30+25) between the time the router sends the poll frame and the time 
the Linux guest receives it.  This causes the keep-alive timer (9 secs = 1 sec X 
8 retries) to expire and disconnect sessions.  The Linux guest eventually 
receives the frames, including the retries, all at the same time, but by then 
the sessions have been dropped by the router. It appears that the frames are 
being buffered and delayed before the guest receives them.
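
The timing described above can be checked with a few lines of arithmetic. This is only a Python sketch, not something taken from the router or the guest: it assumes the 9-second figure means one initial poll plus 8 retries, each waiting 1 second, and the function names are made up for illustration.

```python
# Hedged sketch of the keep-alive arithmetic from the report.
# Assumption: "9 secs = 1 sec X 8 retries" is read as one initial poll
# plus 8 retries, each waiting RETRY_INTERVAL_S for a response.

RETRY_INTERVAL_S = 1.0   # wait per attempt, from the report
RETRIES = 8              # retries after the initial poll, from the report


def keepalive_budget(retry_interval: float = RETRY_INTERVAL_S,
                     retries: int = RETRIES) -> float:
    """Seconds the router waits in total before declaring the session dead."""
    return retry_interval * (1 + retries)


def session_survives(delivery_delay_s: float) -> bool:
    """True if a poll delayed by delivery_delay_s is still answered in time."""
    return delivery_delay_s < keepalive_budget()


if __name__ == "__main__":
    print(keepalive_budget())        # total wait before the session is dropped
    print(session_survives(60e-6))   # normal case: answered within 60 microseconds
    print(session_survives(25.0))    # problem case: delivery delayed by 25 seconds
```

Under these assumptions any delivery delay of 9 seconds or more is enough to drop the session, so the 25-second delay does not have to be reproduced in full to see the symptom.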

We know for sure that the router is sending the poll every 30 seconds, but 
somewhere, somehow, the frames were buffered (?) for 25 seconds before being 
delivered to the guest.  I am trying to figure out at which layer the delay is 
being introduced.  Are there any other traces that I can turn on in z/VM to 
diagnose the problem?   Where do I start looking?

The z/VM LPAR is a small one, running 8 guests with 80MB memory and 16MB and 
48MB vdisks, on a z10.

IFL utilization : 2% X 2 IFLS,
Central Storage  : 95%  768 MB,
XSTORE   : 97%  256MB,
PAGE   : 12% X 2 3390-3 page DASD.

Paging/Spooling activity: 0/s (most of the times)

Thank you,
Ashwin  Bhemidhi



Re: Problem with z/Linux guest Ethernet frames (buffering ?)

2011-05-05 Thread Bhemidhi, Ashwin
Right now the eligible list is 0. I could not check the queues at the 
time of the event; the 3 times it occurred were either after business hours 
or over the weekend. By the time I was able to log on, the eligible list was 0.

I did change the SRM STORBUF setting from the default to 300%, 300%, 200%.

Regards,
Ashwin

q srm
IABIAS : INTENSITY=90%; DURATION=2
LDUBUF : Q1=100% Q2=75% Q3=60%
STORBUF: Q1=300% Q2=300% Q3=200%
DSPBUF : Q1=32767 Q2=32767 Q3=32767
DISPATCHING MINOR TIMESLICE = 5 MS
MAXWSS : LIMIT=%
.. : PAGES=99
XSTORE : 0%
LIMITHARD METHOD: DEADLINE


From: The IBM z/VM Operating System [mailto:IBMVM@LISTSERV.UARK.EDU] On Behalf 
Of Marcy Cortes
Sent: Wednesday, May 04, 2011 8:37 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Problem with z/Linux guest Ethernet frames (buffering ?)

Be sure your guest isn't dropping into the eligible list.
That can look like a network problem.
Issue IND Q to see if it is there, and Q SRM, or consult your performance 
monitor.
The default SRM STORBUF setting is too low for Linux workloads on z/VM, so 
hopefully that has been bumped up.


Marcy




Re: PIPEDDR and attached DASD

2011-05-05 Thread Brian Nielsen
I have solved this problem and present the resolution for the benefit of 
others: it was an MTU mismatch between the VM TCPIP stacks (set at 8992) 
and what the network between them would support (1500).

Further testing showed that ATTACH vs. MDISK and the versions of 
PIPELINES and PIPEDDR were not the cause.  Instead, success or failure 
depended on the data being transferred and whether it traversed the 
external network.  After ripping apart PIPEDDR to understand it so 
that I could add debugging code and options, I found that the transfer for 
a particular disk always failed at a unique track within that disk.  
Manually added pacing (via DELAY stages) would work (although *tediously* 
more slowly than expected based on the delay value used), but not if the 
delay was too small.  Eventually, while looking for pacing information in 
the TCPIP stack, I read the reference that "Selecting an MTU size that is 
too large may cause a client application to hang."  The light bulb went on, 
and after adjusting the MTU size used in the VM TCPIP stack everything 
works fine now.
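
To make the mismatch concrete, here is a small Python sketch of the size arithmetic. It is an illustration only: it assumes minimal 20-byte IP and TCP headers with no options, the helper names are hypothetical, and whether an oversized packet is fragmented or silently dropped depends on the devices along the path, which the sketch does not model.

```python
# Hedged sketch: compare the stack MTU (8992) with the path MTU (1500).
# Assumptions: IPv4 and TCP headers are 20 bytes each, with no options.

IP_HEADER = 20
TCP_HEADER = 20


def max_segment(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def fits_path(payload_len: int, path_mtu: int) -> bool:
    """True if a segment with this payload crosses the path in one packet."""
    return payload_len + IP_HEADER + TCP_HEADER <= path_mtu


if __name__ == "__main__":
    stack_mtu, path_mtu = 8992, 1500           # values from the note above
    print(max_segment(stack_mtu))              # what the stack thinks it may send
    print(max_segment(path_mtu))               # what the network can carry
    print(fits_path(1000, path_mtu))           # a small record gets through
    print(fits_path(max_segment(stack_mtu), path_mtu))  # a full segment does not
```

One plausible reading, offered as an assumption rather than a finding, is that small sends always fit while a full-sized send does not, which would be consistent with failures that depend on the data being transferred.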

Brian Nielsen


On Tue, 5 Apr 2011 12:20:59 -0500, Brian Nielsen bniel...@sco.idaho.gov wrote:

Should PIPEDDR work with attached DASD or does it only support MDISKs? 
The documentation doesn't seem to say.  I get an error with attached DASD 
but it works fine with a full pack MDISK of the same DASD volume when 
doing a dump/restore over TCPIP.

Using attached DASD will avoid label conflicts and also avoids maintaining 
hardcoded MDISKs with DEVNO's in the directory.  Unfortunately, the DEFINE 
MDISK command doesn't have a DEVNO option.

Here is what the failure looks like from the receiving and sending sides 
using attached DASD at both ends:

-

pipeddr restore * 6930 11000 (listen noprompt
Connecting to TCP/IP.  Enter PIPMOD STOP to terminate.
Waiting for connection on port 11000 to restore BNIELSEN 6930.
Sending user is BNIELSEN at VMP2
Receiving data from 172.16.64.45
PIPTCQ1015E ERRNO 54: ECONNRESET.
PIPMSG004I ... Issued from stage 3 of pipeline 3 name iprestore.
PIPMSG001I ... Running tcpdata.
PIPUPK072E Last record not complete.
PIPMSG003I ... Issued from stage 2 of pipeline 1.
PIPMSG001I ... Running unpack.
Data restore failed.
Ready(01015); T=0.01/0.01 08:58:15



pipeddr dump * 9d5e 172.16.64.44 11000
Dumping disk BNIELSEN 9D5E to 172.16.64.44
PIPTCQ1015E ERRNO 32: EPIPE.
PIPMSG004I ... Issued from stage 7 of pipeline 1 name ipread.
PIPMSG001I ... Running tcpclient 172.16.64.44 11000 linger 10 reuseaddr U.
Dump failed.
Ready(01015); T=0.02/0.03 07:54:34




If I create full pack MDISKs via DEFINE MDISK (starting at cyl zero) it 
works fine, as shown below.



pipeddr restore * 6930 11000 (listen noprompt
Connecting to TCP/IP.  Enter PIPMOD STOP to terminate.
Waiting for connection on port 11000 to restore BNIELSEN 6930.
Sending user is BNIELSEN at VMP2
Receiving data from 172.16.64.45
41 MB received.
Data restored successfully.
Ready; T=4.04/4.54 09:34:18


---

pipeddr dump * 9d5e 172.16.64.44 11000
Dumping disk BNIELSEN 9D5E to 172.16.64.44
-- All data sent to BNIELSEN AT VMP1 --
41 MB transmitted.
Dump completed.
Ready; T=6.27/8.21 08:30:37



Brian Nielsen

=
===


May 12 Webcast: Bringing You Up to Date with LE for z/VSE

2011-05-05 Thread Pamela Christina in Sunny Endicott NY (Yes!)
Cross-posted to IBMVM,LINUX390, IBMMAIN for those who are
interested in updates via hour-long, no-charge webcasts.

Feel free to register to listen to one of the two live
calls on Thurs. May 12 or listen to the automatic replay in about
a week.
(remember there's also a linux webcast on May 10/11, too)

http://www.vm.ibm.com/education/lvc/

Title:
Bringing You Up to Date with LE for z/VSE

Abstract:
This webcast will provide an overview of Language Environment for z/VSE,
recap of features, programming aspects, help for debugging, tooling plus
recent functional enhancements with z/VSE 4.3 such as PL/I Multitasking,
LE TCP/IP Multiplexer, added dynamic call capabilities and more.

Speaker:
Wolfgang Bosch, IBM Boeblingen - z/VSE Development & Service,
   Language Environment for z/VSE

Webcast Registration, information, replays at this site:
http://www.vm.ibm.com/education/lvc/


Please direct LVC questions to Julie Liesenfelt at jul...@us.ibm.com


The LVC page has the archive of the past Webcasts.
http://www.vm.ibm.com/education/lvc/
And also for your convenience, you can find them in their own
section below the current year events on the calendar page:
 http://www.vm.ibm.com/events/#2011W

Regards, Pam C


Re: Problem with z/Linux guest Ethernet frames (buffering ?)

2011-05-05 Thread Alan Altmark
On Thursday, 05/05/2011 at 03:22 EDT, Bhemidhi, Ashwin ashw...@ti.com wrote:
 Right now the eligible list is 0. I could not check the queues at the 
 time of the event; the 3 times it occurred were either after business hours 
 or over the weekend. By the time I was able to log on, the eligible list 
 was 0.

You can write an exec that issues IND QUEUES every, say, 10 seconds.  If 
the result shows non-zero eligible lists, record the results (with the 
time) in a file or on the (spooled) console.  Start it when you leave for 
the day.
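
The logic of such a monitor can be sketched as follows. This is Python rather than a CMS exec, and it is only an outline under stated assumptions: `run_command` is a hypothetical callable that returns the text of IND QUEUES however you obtain it, and the parser assumes eligible-list entries are marked with a status token E0 through E3, which should be verified against real output on your system.

```python
# Hedged sketch of a polling monitor for the eligible list.
# Assumptions: run_command() returns IND QUEUES output as text, and
# eligible-list users carry a status token E0..E3 in that output.
import re
import time
from datetime import datetime
from typing import Callable, List, Optional

ELIGIBLE = re.compile(r"\bE[0-3]\b")


def lines_with_eligible(output: str) -> List[str]:
    """Return the output lines that mention an eligible-list status."""
    return [ln for ln in output.splitlines() if ELIGIBLE.search(ln)]


def watch(run_command: Callable[[], str],
          log: Callable[[str], None],
          interval: float = 10.0,
          rounds: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll every `interval` seconds; log timestamped eligible-list lines."""
    done = 0
    while rounds is None or done < rounds:
        for ln in lines_with_eligible(run_command()):
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} {ln.strip()}")
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)


if __name__ == "__main__":
    # Hypothetical sample text standing in for real IND QUEUES output.
    sample = ("LINUX01  Q3 PS 00001000/00000900\n"
              "LINUX02  E3 R  00002000/00001900\n")
    watch(lambda: sample, print, rounds=1)
```

Only polls that find something are logged, so a quiet night produces an empty log and any entry carries the time at which a guest was seen in the eligible list.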

 I did change the SRM STORBUF setting from the default to 300%, 300%, 
 200%.

Why?  If you put it back to the defaults, does the problem go away?

Alan Altmark

z/VM and Linux on System z Consultant
IBM System Lab Services and Training 
ibm.com/systems/services/labservices 
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott