Re: Synchronous option for chccwdev -- was there a resolution?
On Tue, 31 Jul 2012, David Boyes wrote:
> Florian's original post.
> Corroborating posts from other users (Mark Post, etc)
> My data (on average 3 out of 100 tests fail)
>
> I'd be happy to send you more examples. Are you looking for something
> specific? The script recently posted here (by you, I think) can generate as
> much failure data as you like.
>
> To be clear, dasdfmt doesn't complain about other users, it fails because
> there's no device for it to operate on (yet). Inserting a wait of a few
> (variable between 1 and 30 seconds, depending on load) seconds reduces, but
> does not eliminate, the failures. Introducing a 60-90 second wait produces a
> fairly reliable operation, but still not 100%. Given the need for a reliable
> test for use in automation and/or the number of devices that commonly need to
> be processed to create large LVM collections, a minute and a half wait just
> because we can't reliably depend on chccwdev to be atomic isn't acceptable.

Hm, ok. I think we are dealing with 3 different types of failures here:

* missing error handling in scripts
* failing to ensure exclusive usage
  * most of the tools needed to activate a device require exclusive usage
    of the device
  * most of the tools needed to activate a device trigger additional
    uevents, which lead udev to examine the device

  So instead of:
    chccwdev -e
    dasdfmt
    fdasd
    mkswap
    chccwdev -d
  you need to do:
    chccwdev -e
    udevadm settle
    dasdfmt
    udevadm settle
    fdasd
    udevadm settle
    mkswap
    udevadm settle
    chccwdev -d
  And using the --exit-if-exists option is not enough here - you really
  need udev to finish using the device.
* cases where udev settle is not enough
  * after udev settle no device node is created
  * after udev settle udev is still using the device

Since this thread is about the last class of failures, I ran a _lot_ of
tests over the last couple of days under various system loads to trigger this
specific error. I could not find a single case where udev settle did not do
its job.

However, I found 2 possibly related bugs: one in CIO where a device is left
in an unusable state, and one in DASD which could lead to udev using the
device after settle returns (but I could not trigger this one). Once I'm done
fixing these bugs I'll look into the distros to find out whether the fixes
are applicable there and to look for other bugs lurking there.

So I suspect that most of the things you observed are results of the 2nd
error class (but again, I've not looked into the distros yet, maybe the
situation is different there).

Regards,
Sebastian

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/
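The settle-after-every-step sequence described above, combined with the error handling named as the first failure class, could be sketched roughly as follows. The names `run_step` and `exercise_dasd` are made up for illustration, and the dasdfmt and fdasd options are assumptions for an ECKD volume that should be checked against the installed s390-tools level.

```bash
# Run one command, then wait for udev to finish handling the uevents it
# triggered. Returns non-zero as soon as either part fails.
run_step() {
    "$@" || { echo "step failed: $*" >&2; return 1; }
    udevadm settle || { echo "udevadm settle failed after: $*" >&2; return 1; }
}

# Hypothetical helper: enable, format, partition, mkswap, disable.
# $1 is a bus ID such as 0.0.0302.
exercise_dasd() {
    local busid="$1"
    local node="/dev/disk/by-path/ccw-$busid"
    run_step chccwdev -e "$busid" &&
    run_step dasdfmt -b 4096 -y -f "$node" &&   # options are assumptions
    run_step fdasd -a "$node" &&                # -a: one partition, automatic
    run_step mkswap "$node-part1" &&
    chccwdev -d "$busid"
}
```

Calling `exercise_dasd 0.0.0302` stops at the first failing step and reports which one it was, instead of letting a later command fail for a reason that is harder to trace.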
Re: Synchronous option for chccwdev -- was there a resolution?
Hello list,

Florian wrote about a .1 second sleep time - what a concept! I've never
thought to sleep for less than a second.

David wrote:
> ... Inserting a wait of a few (variable between 1 and 30 seconds,
> depending on load) seconds reduces, but does not eliminate, the
> failures. Introducing a 60-90 second wait produces a fairly
> reliable operation, but still not 100%. ...

These two ideas got me to thinking - could we get to an enableDevice()
function that is both reliable and fast until/if "chccwdev -e" gets fixed?

I reworked that test code and was able to get a few failures on one system,
but only needed a millisecond of sleep, if my assumptions are correct (I
forget that a millisecond is a long time for a computer). Could someone copy
and paste this script and test on a system that fails more regularly? Thanks.

Here's the code:

# cat testudev
# snip below here
#!/bin/bash
function enableDevice()
 {
  chccwdev -e "$1" > /dev/null 2>&1
  local rc=$?
  if [ $rc != 0 ]; then # chccwdev failed => try again
    for seconds in .001 .01 .04 .10 .14 .24 .38 .62 1 2 3 5 8 13 21 34; do
      echo "chccwdev -e failed; sleeping $seconds seconds"
      sleep $seconds
      chccwdev -e "$1" > /dev/null 2>&1
      rc=$?
      if [ "$rc" = 0 ]; then # success
        break # out of for loop
      fi
    done
  fi
  $udevCmd
  return $rc
 }

udevCmd="udevadm settle"
# udevCmd="udevsettle"
vmcp define vfb-512 302 2000 > /dev/null
enableDevice 0.0.0302
rc=$?
if [ $rc != 0 ]; then
  echo "return code from enableDevice 0.0.0302 = $rc"
fi
mkswap /dev/disk/by-path/ccw-0.0.0302-part1 > /dev/null 2>&1
rc=$?
if [ $rc != 0 ]; then
  echo "mkswap failed"
else
  echo "mkswap succeeded"
fi
chccwdev -d 0.0.0302 > /dev/null
rc=$?
if [ $rc != 0 ]; then
  echo "return code from chccwdev -d 0.0.0302 = $rc"
fi
vmcp det 302 > /dev/null
# snip above here

Here's the test run:

# cat /etc/*release
SUSE Linux Enterprise Server 11 (s390x)
VERSION = 11
PATCHLEVEL = 1
LSB_VERSION="core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-s390x:core-3.2-s390x:core-4.0-s390x"
# for i in {1..44}; do ./testudev | grep failed; done
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds

"Mike MacIsaac"
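The retry idea in the script above can be pulled out into a helper that works for any command. This is only a sketch: the name `retry_with_backoff` and the delay list are illustrative choices, and fractional arguments to `sleep` assume GNU coreutils.

```bash
# Run a command; if it fails, retry it after each delay in the list.
# Returns 0 on the first success, otherwise the status of the last attempt.
retry_with_backoff() {
    local delay rc
    "$@" && return 0
    rc=$?
    for delay in 0.001 0.01 0.1 1 2 5; do
        sleep "$delay"
        "$@" && return 0
        rc=$?
    done
    return "$rc"
}
```

For example, `retry_with_backoff chccwdev -e 0.0.0302 && udevadm settle` keeps the short first delays that were enough in the test run above, but gives up after about eight seconds in total.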
Re: Synchronous option for chccwdev -- was there a resolution?
> On Sun, 29 Jul 2012, David Boyes wrote:
> > Since we have ample data from multiple sources that this DOES NOT
> > operate reliably, the original question still stands.
>
> Would you mind sharing some of this ample data? Are these all cases where
> dasdfmt complains about other users after "udevadm settle" returned?

Florian's original post.
Corroborating posts from other users (Mark Post, etc)
My data (on average 3 out of 100 tests fail)

I'd be happy to send you more examples. Are you looking for something
specific? The script recently posted here (by you, I think) can generate as
much failure data as you like.

To be clear, dasdfmt doesn't complain about other users, it fails because
there's no device for it to operate on (yet). Inserting a wait of a few
(variable between 1 and 30 seconds, depending on load) seconds reduces, but
does not eliminate, the failures. Introducing a 60-90 second wait produces a
fairly reliable operation, but still not 100%. Given the need for a reliable
test for use in automation and/or the number of devices that commonly need to
be processed to create large LVM collections, a minute and a half wait just
because we can't reliably depend on chccwdev to be atomic isn't acceptable.

I would think it to be a reasonable expectation that 'chccwdev' would not
exit until the operation requested was tested to be actually complete and
ready to use, or at least provide an option to request that behavior.

> When this fails do you find messages from udev in /var/log/messages?

Only if udev debugging is turned on (at least in my case -- can't really
speak for others). I will send you an example offlist.
Re: Synchronous option for chccwdev -- was there a resolution?
On Sun, 29 Jul 2012, David Boyes wrote:
> Since we have ample data from multiple sources that this DOES NOT operate
> reliably, the original question still stands.

Would you mind sharing some of this ample data? Are these all cases where
dasdfmt complains about other users after "udevadm settle" returned?

When this fails do you find messages from udev in /var/log/messages?

Regards,
Sebastian

> How can we reliably block until an I/O subsystem operation is fully and
> reliably known to be complete?
>
> Tracing the udevadm process seems to show some /by-uuid processing that is
> failing due to UUIDs being the same -- is there something in the udev device
> activation that somehow relies on a unique UUID? If so, that would explain
> why it sometimes works and sometimes doesn't (if the device you're trying to
> activate is on a different physical disk, you'd win here; if it's on the same
> physical disk, you'd lose, or at least have udev try alternative code paths
> to get a unique device node created and assigned). Using TDISK or VDISK for
> the test case would mislead, as at least tdisk could be on different physical
> volumes. I can send you a log from the trace if that would help you.
>
> In any case, I think Mike's question is a good one: if chccwdev needs a
> 'udevadm settle' to operate correctly, why isn't it doing it itself? It seems
> like we should be able to rely on chccwdev operations being atomic.
Re: Synchronous option for chccwdev -- was there a resolution?
> On Sunday, 07/29/2012 at 05:14 EDT, David Boyes wrote:
> > In any case, I think Mike's question is a good one: if chccwdev needs
> > a 'udevadm settle' to operate correctly, why isn't it doing it itself?
> > It seems like we should be able to rely on chccwdev operations being
> > atomic.
>
> This seems like the wrong question. Rather, why is the creation of
> /dev/x running asynchronously? I can appreciate that there are
> asynchronous *user space* things that want to kick off when a device is
> added, but creation of the /dev entry?

Well, if we punt to udev to manage the device creation, that's the nature of
udev as designed -- to be async (to avoid some messy kernel-space stuff that
tended to hang the machine if things didn't go right). I doubt that any of us
will significantly influence the direction/design of udev. chccwdev is
completely under IBM control, though -- it'll be a lot easier to change that
than udev.

> Is this an issue on other platforms, too?

Probably, although I would bet no other platform does as many "hardware"
device move/add/changes as 390x does, so it hasn't really been noticed.
Re: Synchronous option for chccwdev -- was there a resolution?
On Sunday, 07/29/2012 at 05:14 EDT, David Boyes wrote:
> In any case, I think Mike's question is a good one: if chccwdev needs a
> 'udevadm settle' to operate correctly, why isn't it doing it itself? It seems
> like we should be able to rely on chccwdev operations being atomic.

This seems like the wrong question. Rather, why is the creation of /dev/x
running asynchronously? I can appreciate that there are asynchronous *user
space* things that want to kick off when a device is added, but creation of
the /dev entry? I mean, how is fstab able to do its job if the devices
referenced therein might, in theory, not be defined yet? Is this an issue on
other platforms, too?

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
Re: Synchronous option for chccwdev -- was there a resolution?
> Since dasdfmt does the low-level formatting stuff it tries to make sure it's
> the only user of the device. But in your case it looks like sometimes it's
> not the only user and it's likely that's because some worker of udev is not
> finished and still has a file descriptor to this device node opened.
>
> So I still think it is sufficient to do:
> chccwdev -e xxx ;udevadm settle ;dasdfmt xxx

Since we have ample data from multiple sources that this DOES NOT operate
reliably, the original question still stands.

How can we reliably block until an I/O subsystem operation is fully and
reliably known to be complete?

Tracing the udevadm process seems to show some /by-uuid processing that is
failing due to UUIDs being the same -- is there something in the udev device
activation that somehow relies on a unique UUID? If so, that would explain
why it sometimes works and sometimes doesn't (if the device you're trying to
activate is on a different physical disk, you'd win here; if it's on the same
physical disk, you'd lose, or at least have udev try alternative code paths
to get a unique device node created and assigned). Using TDISK or VDISK for
the test case would mislead, as at least tdisk could be on different physical
volumes. I can send you a log from the trace if that would help you.

In any case, I think Mike's question is a good one: if chccwdev needs a
'udevadm settle' to operate correctly, why isn't it doing it itself? It seems
like we should be able to rely on chccwdev operations being atomic.
Re: Synchronous option for chccwdev -- was there a resolution?
On Sat, 28 Jul 2012, Michael MacIsaac wrote:
> function enableDevice { chccwdev -e $1; udevadm settle; }
>
> and always call that function instead of chccwdev -e. So my question is
> still: "If a udevadm settle is always required after a chccwdev -e, then
> why is it not just built into the command?"

Because a) it depends on the type of the device and b) we would have a
dependency on udevadm.

Regards,
Sebastian
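As long as the settle call lives in wrapper scripts rather than in chccwdev, the wrapper has to cope with that dependency itself. A small sketch, assuming only the two command names already used in this thread (`udevadm settle` on newer systems, `udevsettle` on SLES10); the function name `pick_settle_cmd` is hypothetical:

```bash
# Print the settle command available on this system.
# Returns 1 (and prints nothing) if neither variant is installed.
pick_settle_cmd() {
    if command -v udevadm >/dev/null 2>&1; then
        echo "udevadm settle"
    elif command -v udevsettle >/dev/null 2>&1; then
        echo "udevsettle"
    else
        return 1
    fi
}
```

A script could then start with `udevCmd=$(pick_settle_cmd) || exit 1` instead of hard-coding one of the two.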
Re: Synchronous option for chccwdev -- was there a resolution?
Sebastian,

> So I still think it is sufficient to do:
> chccwdev -e xxx ;udevadm settle ;dasdfmt xxx

... which is somewhat the conclusion I came to with the previous test
script. So everyone wanting to script with chccwdev -e could write a function
such as:

function enableDevice { chccwdev -e $1; udevadm settle; }

and always call that function instead of chccwdev -e. So my question is
still: "If a udevadm settle is always required after a chccwdev -e, then why
is it not just built into the command?"

"Mike MacIsaac"
Re: Synchronous option for chccwdev -- was there a resolution?
On Fri, 27 Jul 2012, Florian Bilek wrote:
> I can confirm that the udev settle is returning always with zero. I have
> the timeout set to even to 60 sec and an exit if the device node is
> available. And that is still not enough because udev exits. Without exit I
> had set the timeout to 30 secs. Only running the loop two times the chance
> gets high to succeed.
>
> As I have written in my first mail, I encountered that problem already long
> time ago. I found always workarounds but with every kernel update the
> chance is there that the race condition is coming back.
>
> In case it seems that there isn't a reliable check that the device is
> really useable. It is always a gamble if your procedure succeeds or not.

I found the other mail thread mentioned here and have an assumption of what
went wrong. I blame the --exit-if-exists option of udev settle (which should
be ok in most cases but is not if you want to use dasdfmt afterwards).

For the sake of argument let's assume that using udev settle like that would
be the same as:

if [ ! -e /dev/dasdx ] ;then
  udevadm settle
fi

So sometimes you just wait for udev calling mknod but you don't wait for udev
finishing the other stuff it does with this device.

Since dasdfmt does the low-level formatting stuff it tries to make sure it's
the only user of the device. But in your case it looks like sometimes it's
not the only user and it's likely that's because some worker of udev is not
finished and still has a file descriptor to this device node opened.

So I still think it is sufficient to do:
chccwdev -e xxx ;udevadm settle ;dasdfmt xxx

Regards,
Sebastian

> Since my work on the clone procedure I see that the critical path is the
> amount of steps necessary to make one device useable:
>
> 1. attach it (vmcp)
> 2. vary it online (chccwdev -e)
> 3. format it (dasdfmt)
> 4. partition it (fdasd)
> 5. make the file system
>
> The most critical steps are 2 and 3.
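The explanation above is that a udev worker may still hold the device node open when dasdfmt starts. One way to look for that from a script is `fuser` from psmisc, which exits 0 while at least one process is accessing the file. The sketch below is an assumption-laden illustration, not something proposed in the thread: the name `wait_until_unused` is made up, and the check is inherently racy because a worker can open the node right after the last poll.

```bash
# Poll until no process has the node open. Gives up after max_tries polls
# (default 50, about 5 seconds with the 0.1 second sleep).
wait_until_unused() {
    local node="$1" max_tries="${2:-50}" i=0
    while fuser -s "$node" 2>/dev/null; do
        i=$((i + 1))
        [ "$i" -ge "$max_tries" ] && return 1
        sleep 0.1
    done
    return 0
}
```

It would sit between `udevadm settle` and `dasdfmt`, as an extra check on top of the settle call rather than a replacement for it.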
Re: Synchronous option for chccwdev -- was there a resolution ?
Point taken, BUT, I might add, this worked on a Z9 where we have about 40
instances on a single LPAR, and one of those was a JAVA resource hog from our
state tax commission. Still only have about 200 more to set up.

Ben Duncan - Business Network Solutions, Inc.
336 Elton Road Jackson MS, 39212
"Never attribute to malice, that which can be adequately explained by
stupidity" - Hanlon's Razor

> -------- Original Message --------
> Subject: Re: Synchronous option for chccwdev -- was there a resolution ?
> From: David Boyes
> Date: Fri, July 27, 2012 11:17 am
> To: LINUX-390@VM.MARIST.EDU
>
> > We have a set of scripts that Setup the WHOLE multipath SAN disk for us.
> > From the chccwdev, zfcp_*, to fdisk
> > and format and multipath setup and mount (Yes the WHOLE thing). We
> > have found by placing sleep for 15 seconds between commands , especially
> > the hardware level ones, increased our reliability for success.
>
> Yeah, but the timing for "reliably" doing that appears to depend a lot on
> what else is going on -- could be 1 second or 5 or 15 -- and it's inefficient
> to sit there and wait or poll if you don't have to. That's why I'm looking
> for a reliable "yes, you can proceed, it's ready to go" indicator.
>
> Leaving the default as the current somewhat async behavior is OK, but I'd
> really like a "block until you're absolutely sure the action is complete and
> functioning" option.
Re: Synchronous option for chccwdev -- was there a resolution?
>>> On 7/27/2012 at 01:26 PM, Michael MacIsaac wrote:
> Nice test case! I modified it a bit :))

I have two SLES10 SP4 systems. One is on a fairly loaded box, and one is on a
fairly idle box. On the loaded box, the failure rate of the chccwdev -e
command was fairly high, even with a 2 second sleep between the vmcp define
command and the chccwdev. On the fairly idle box, I never saw a chccwdev -e
failure (but I did get one chccwdev -d failure). In both cases, I iterated
over the script 1000 times, with the following results:

Idle SLES10 SP4:
492 cases with udevsettle - 0 failures = 100% successes
507 cases without udevsettle - 506 failures = 99.8% failures

Busy SLES10 SP4:
154 chccwdev -e failures
1 chccwdev -d failure
= 15.5% total failures. Note that the chccwdev -d command _always_ follows a
udevsettle command.
413 cases with udevsettle - 0 failures = 100% successes
433 cases without udevsettle - 423 failures = 97.7% failures

In one case, the chccwdev -e failure was temporary. In all other 153 cases,
the entry in /sys/bus/ccw/devices/ was not created, even after 3 seconds of
waiting. The message from chccwdev -e was "0.0.0302 is not a channel device."

That's while running the script in a loop. If I run the script manually each
time, I tend to see a couple of different failures, although much less
frequently. The most common is that the entry in /sys/bus/ccw/devices/ does
show up within a few seconds (3 or less). Presumably, this case is taken care
of by udevsettle. The next most common (which is far less common than the
first failure mode) is that the entry in /sys/bus/ccw/devices/ is never
created, nor is the entry in /sys/devices/css0/.

In some rare cases, when the chccwdev -d command is issued, the entry in
/sys/devices/css0/0.0./ is removed, but the /sys/bus/ccw/devices/0.0.0302/ is
not, leading to a broken symbolic link. If I redefine the device, I can use
it again, but disabling it and detaching it leaves the dangling symlink. The
only thing that seems to clear that up is a reboot. I don't know if leaving
it alone would cause any problems further on down the road or not.

Idle SLES11 SP2
502 cases with udevadm settle - 0 failures, 100% successes
498 cases without udevadm settle
23 chccwdev -e failures = 11.5%
97 udevsettle cases = 100% success
80 no udevsettle = 100% failure.

Given the difference in results between a busy and idle SLES10 SP4 system,
and the fact that I don't have a SLES11 SP2 guest on a busy system, David's
rate of about 3% failures with udevadm settle can't be ignored.

I doubt very much it's the udevadm settle timeout value. The default is 180
seconds for _everything_.

I think at this point, I need to state the obvious: if a customer (or
business partner) experiencing this problem has a support contract with SUSE,
Red Hat, or IBM and opens up a support request, more effort is likely to go
into figuring out a fix.

Mark Post
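The broken symbolic link described above can be spotted from a script with the standard `-L` and `-e` tests. A minimal sketch; the function name `list_dangling_links` is made up, and the optional directory argument exists only so that it can be tried on a system without a ccw bus:

```bash
# Print every entry in the directory that is a symbolic link whose target
# no longer exists. The directory defaults to /sys/bus/ccw/devices.
list_dangling_links() {
    local dir="${1:-/sys/bus/ccw/devices}" entry
    for entry in "$dir"/*; do
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            echo "$entry"
        fi
    done
}
```

Running it after each `chccwdev -d` in a test loop would show how often the rare failure mode occurs, without having to inspect sysfs by hand.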
Re: Synchronous option for chccwdev -- was there a resolution?
Dear all,

I can confirm that udev settle always returns zero. I have even set the
timeout to 60 seconds, with an exit if the device node is available. And that
is still not enough because udev exits. Without the exit I had set the
timeout to 30 seconds. Only when the loop runs twice does the chance of
success get high.

As I wrote in my first mail, I already encountered this problem a long time
ago. I always found workarounds, but with every kernel update there is a
chance that the race condition comes back.

In any case, it seems that there isn't a reliable check that the device is
really usable. It is always a gamble whether your procedure succeeds or not.

Since my work on the clone procedure I see that the critical path is the
number of steps necessary to make one device usable:

1. attach it (vmcp)
2. vary it online (chccwdev -e)
3. format it (dasdfmt)
4. partition it (fdasd)
5. make the file system

The most critical steps are 2 and 3. I see in my exec that most of the time
both steps fail and need to be rerun. This is happening with the current
kernel version on SLES 11 SP2. On SP1 usually only one of the two steps
failed.

On SLES 10 the problem was that the partition node didn't show up, or that
after a certain number of (successful) chccwdevs the kernel could not bring
the device online any more and a reboot of the guest was required.

sync or other tricks I know do not really solve the problem, and udev is also
disappointing since it reports that everything is safe, which is not the
case. I used dasd_config from SLES 11 but it didn't solve the problem either.

The chances are high that the situation arises between any two of these
steps. Having a fixed 30 second delay between all the steps makes the
process, including formatting and creation of the filesystem, quite long.
Personally I would say unacceptably long.

We use here an IBM z/10 with DS 8700 for the disks. z/VM is 6.2 on latest
RSU. So it is original and fast equipment, and the situation still appears
there.

Kind regards,
Florian

On Fri, Jul 27, 2012 at 8:23 PM, David Boyes wrote:
> > I believe that's the piece that's missing (for most people). I can easily
> > reproduce the problem on my SLES11 SP2 system with this script:
> > vmcp define vfb-512 302 2000
> > date +%H:%M:%S.%N
> > chccwdev -e 0.0.0302
> > mkswap /dev/disk/by-path/ccw-0.0.0302-part1
>
> Yeah, that's pretty much guaranteed to fail. If you insert a 'udevadm
> settle' after the 'chccwdev -e', I still get a failure about 3 times out of
> 100 attempts, though.
>
> Alan may be on to something with the timeout value for udev for that type
> of device.
Re: Synchronous option for chccwdev -- was there a resolution?
> I believe that's the piece that's missing (for most people). I can easily
> reproduce the problem on my SLES11 SP2 system with this script:
> vmcp define vfb-512 302 2000
> date +%H:%M:%S.%N
> chccwdev -e 0.0.0302
> mkswap /dev/disk/by-path/ccw-0.0.0302-part1

Yeah, that's pretty much guaranteed to fail. If you insert a 'udevadm settle'
after the 'chccwdev -e', I still get a failure about 3 times out of 100
attempts, though.

Alan may be on to something with the timeout value for udev for that type of
device.
Re: Synchronous option for chccwdev -- was there a resolution?
>>> On 7/27/2012 at 01:26 PM, Michael MacIsaac wrote:
> I never got the chccwdev to fail.

If you did the test on SLES11 or later, I don't think that command will fail
since it uses /proc/cio_settle.

Mark Post
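For scripts that issue `vmcp define` or `vmcp attach` themselves, the same file can be used directly before calling chccwdev. The sketch below assumes the usual interface, where writing any value to /proc/cio_settle blocks until the common I/O layer has finished its pending work. The function name `cio_settle` and the optional path argument (there only so the function can be tried on other systems) are illustrative.

```bash
# Wait for the common I/O layer to finish pending work.
# Returns 1 if the file is missing or not writable (older kernels, non-root).
cio_settle() {
    local file="${1:-/proc/cio_settle}"
    [ -w "$file" ] || return 1
    echo 1 > "$file"
}
```

A caller can treat a non-zero return as "not supported here" and fall back to a short retry loop around `chccwdev -e`.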
Re: Synchronous option for chccwdev -- was there a resolution?
Mark,

> I can easily reproduce the problem on my SLES11 SP2 system with this script:
> ...

Nice test case! I modified it a bit :))

I never got the chccwdev to fail. I did see the mkswap fail regularly. Then I
randomly added a "udevadm settle" after the chccwdev -e. Every time the
udevadm settle kicks in, the mkswap works! So maybe the solution is as easy
as adding a udevadm settle after the chccwdev -e?

Here's the test run:

# for i in {1..10}
> do
> testudev
> done
seed is small - add udevadm settle
SUCCESS
seed is large
FAILURE
seed is large
FAILURE
seed is small - add udevadm settle
SUCCESS
seed is small - add udevadm settle
SUCCESS
seed is large
FAILURE
seed is large
FAILURE
seed is large
FAILURE
seed is large
FAILURE
seed is large
FAILURE

Here's the code:

# cat testudev
#!/bin/bash
let seed=$RANDOM
if [ $seed -lt 16384 ]; then # add a udevsettle
  echo "seed is small - add udevadm settle"
else
  echo "seed is large"
fi
vmcp define vfb-512 302 2000 > /dev/null
chccwdev -e 0.0.0302 > /dev/null
rc=$?
if [ $rc != 0 ]; then
  echo "return code from chccwdev -e 0.0.0302 = $rc"
fi
if [ $seed -lt 16384 ]; then # add a udevsettle
  udevadm settle
fi
mkswap /dev/disk/by-path/ccw-0.0.0302-part1 > /dev/null 2>&1
rc=$?
if [ $rc != 0 ]; then
  echo "FAILURE"
else
  echo "SUCCESS"
fi
chccwdev -d 0.0.0302 > /dev/null
rc=$?
if [ $rc != 0 ]; then
  echo "return code from chccwdev -d 0.0.0302 = $rc"
fi
vmcp det 302 > /dev/null

"Mike MacIsaac"
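To turn runs like the one above into a failure rate, a small counting wrapper may help. This is a sketch with a made-up name, `count_failures`. It relies on the exit status of the command under test, so a script like testudev would need to end with a non-zero exit status after printing FAILURE (as posted, its exit status comes from the final vmcp command).

```bash
# Run a command the given number of times and report how many runs failed.
# Output of the command itself is discarded; only its exit status counts.
count_failures() {
    local runs="$1" i=0 failed=0
    shift
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed of $runs runs failed"
}
```

Usage would look like `count_failures 100 ./testudev`, once on an idle system and once on a loaded one.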
Re: Synchronous option for chccwdev -- was there a resolution?
On Friday, 07/27/2012 at 11:33 EDT, Sebastian Ott wrote:
> The process of setting the device online involves generic path
> verification work done by the Common IO Layer and device specific
> online processing done by the device driver (DASD in this case). Once
> the DASD driver finished its work and created a block device, userspace
> is informed about this via uevents. After that chccwdev returns. The
> only thing that's missing now is udev creating a device node and that's
> covered via udev settle.

Is it possible that the udev settle timeout is set to zero, preventing it
from waiting? (I see that you can set up a udev debug log to see what udev is
doing.)

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
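One way to rule the timeout in or out is to pass it explicitly and check the exit status, since `udevadm settle --timeout=SECONDS` returns non-zero when the queue has not drained in time. A minimal sketch; the wrapper name `settle_or_report` and the 180 second default are assumptions:

```bash
# Wait for the udev event queue to drain, with an explicit timeout in
# seconds, and say so on stderr when the wait fails or times out.
settle_or_report() {
    local timeout="${1:-180}"
    if ! udevadm settle --timeout="$timeout"; then
        echo "udevadm settle failed or timed out after ${timeout}s" >&2
        return 1
    fi
}
```

If `settle_or_report 60` succeeds and the following command still fails, the timeout value is unlikely to be the cause.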
Re: Synchronous option for chccwdev -- was there a resolution ?
> We have a set of scripts that Setup the WHOLE multipath SAN disk for us.
> From the chccwdev, zfcp_*, to fdisk
> and format and multipath setup and mount (Yes the WHOLE thing). We
> have found by placing sleep for 15 seconds between commands , especially
> the hardware level ones, increased our reliability for success.

Yeah, but the timing for "reliably" doing that appears to depend a lot on what else is going on -- could be 1 second or 5 or 15 -- and it's inefficient to sit there and wait or poll if you don't have to. That's why I'm looking for a reliable "yes, you can proceed, it's ready to go" indicator.

Leaving the default as the current somewhat async behavior is OK, but I'd really like a "block until you're absolutely sure the action is complete and functioning" option.
Re: Synchronous option for chccwdev -- was there a resolution?
>>> On 7/27/2012 at 11:15 AM, Sebastian Ott wrote:
> Once the DASD driver finished its work and created a block device, userspace
> is informed about this via uevents. After that chccwdev returns. The
> only thing that's missing now is udev creating a device node and that's
> covered via udev settle.

I believe that's the piece that's missing (for most people). I can easily reproduce the problem on my SLES11 SP2 system with this script:

    vmcp define vfb-512 302 2000
    date +%H:%M:%S.%N
    chccwdev -e 0.0.0302
    mkswap /dev/disk/by-path/ccw-0.0.0302-part1
    date +%H:%M:%S.%N
    udevadm settle
    chccwdev -d 0.0.0302
    vmcp det 302

It fails almost every time. (And if I leave the udevadm settle command out before the chccwdev -d command, that will usually fail also.)

If I add a udevadm settle just after the chccwdev -e, it works. Since my system is not heavily loaded, I can't be sure that it will work 100% of the time, but it certainly does a better job than without it. For my SLES10 system, I had to use the udevsettle command, of course. Our dasd_configure script uses udevsettle/udevadm settle for bringing volumes on and offline, and it seems to work fine as well.

Mark Post
Re: Synchronous option for chccwdev -- was there a resolution?
> Once the DASD driver finished
> its work and created a block device, userspace is informed about this via
> uevents. After that chccwdev returns. The only thing that's missing now is
> udev creating a device node and that's covered via udev settle.

Thanks for the walkthrough. The problem appears to be somewhere after the exit from udev settle -- the device is not always actually ready and available for use when udev settle exits. We're seeing the problem on RHEL 6 guests on varying hardware (zPDT, older Sharks, etc) -- basically slower hardware that can't necessarily respond instantly to requests. I don't know what Florian has, but that seems to be characteristic of what we're seeing when it fails.

> So unless I'm missing something the described method
> should work reliable on current distros. (In a real world scenario you have to
> check return codes of the previous steps before executing the next one.)

Yeah. We're getting rc=0 for the chccwdev and udev settle, which is what's kinda weird about it. I'll see if we can convince udev to wait a bit longer on the settle.
Re: Synchronous option for chccwdev -- was there a resolution ?
We have a set of scripts that set up the WHOLE multipath SAN disk for us, from the chccwdev, zfcp_*, to fdisk and format and multipath setup and mount (yes, the WHOLE thing). We have found that placing a sleep of 15 seconds between commands, especially the hardware-level ones, increased our reliability.

Ben Duncan - Business Network Solutions, Inc. 336 Elton Road Jackson MS, 39212
"Never attribute to malice, that which can be adequately explained by stupidity" - Hanlon's Razor

> Original Message ----
> Subject: Re: Synchronous option for chccwdev -- was there a resolution?
> From: David Boyes
> Date: Fri, July 27, 2012 9:05 am
> To: LINUX-390@VM.MARIST.EDU
>
> > On which distro do you have problems with chccwdev?
> > [snip]
> > ..which appears to work fine.
>
> It's not a distribution issue; it's a timing-dependent issue that has to do
> with how quickly your hardware responds. Your script will work *most* of the
> time -- except when it doesn't.
> Easy way to reproduce the problem is to try your script on a zPDT or a
> heavily loaded system where the response to requests may not be immediate.
> The udev settle command isn't a reliable indicator that the device is
> available.
>
> I'm looking for a reliable method that works all the time.
Re: Synchronous option for chccwdev -- was there a resolution?
On Fri, 27 Jul 2012, David Boyes wrote:
> > On which distro do you have problems with chccwdev?
> > [snip]
> > ..which appears to work fine.
>
> It's not a distribution issue; it's a timing-dependent issue that has to do
> with how quickly your hardware responds. Your script will work *most* of the
> time -- except when it doesn't.
> Easy way to reproduce the problem is to try your script on a zPDT or a
> heavily loaded system where the response to requests may not be immediate.
> The udev settle command isn't a reliable indicator that the device is
> available.
>
> I'm looking for a reliable method that works all the time.

OK. We have 3 userspace triggered actions here:
1) make the device available to linux, e.g. via vmcp define
2) set the device online via chccwdev
3) actually use the device via its device node

After 1) and before we can do 2) we have to make sure that Linux has received the machine check and that the device recognition steps done by the Common IO Layer (triggered by the machine check) are finished. Both are achieved by cio_settle, which is invoked by chccwdev (on current distros) before the online setting starts.

The process of setting the device online involves generic path verification work done by the Common IO Layer and device specific online processing done by the device driver (DASD in this case). Once the DASD driver has finished its work and created a block device, userspace is informed about this via uevents. After that chccwdev returns. The only thing that's missing now is udev creating a device node, and that's covered via udev settle.

I'm certainly no expert on the DASD driver, but I can't see a race window or timing issue here. So unless I'm missing something, the described method should work reliably on current distros. (In a real world scenario you have to check return codes of the previous steps before executing the next one.)
Regards,
Sebastian
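The sequence described in this message, together with the return-code checks mentioned at the end, might be sketched as follows. `run_settled` is a hypothetical helper, and the `SETTLE_CMD` variable is an assumption introduced here so that `udevsettle` can be substituted on older distros.

```shell
# SETTLE_CMD is an illustrative knob: override it with "udevsettle"
# on systems that predate "udevadm settle".
SETTLE_CMD=${SETTLE_CMD:-"udevadm settle"}

# run_settled CMD [ARGS...]
# Runs one step, then waits for udev to finish processing the events
# that step triggered. Returns non-zero if the step or the settle fails.
run_settled() {
    "$@" || { echo "step failed: $*" >&2; return 1; }
    # Intentionally unquoted so "udevadm settle" splits into command + argument.
    $SETTLE_CMD || { echo "settle failed after: $*" >&2; return 1; }
}
```

Chained with `&&`, the first failing step stops the sequence, for example `run_settled chccwdev -e 0.0.0302 && run_settled mkswap /dev/disk/by-path/ccw-0.0.0302-part1 && run_settled chccwdev -d 0.0.0302`. This only enforces the ordering and the checks; it does not address the case reported elsewhere in the thread where both commands return 0 and the device is still not usable.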
Re: Synchronous option for chccwdev -- was there a resolution?
> On which distro do you have problems with chccwdev?
> [snip]
> ..which appears to work fine.

It's not a distribution issue; it's a timing-dependent issue that has to do with how quickly your hardware responds. Your script will work *most* of the time -- except when it doesn't.

Easy way to reproduce the problem is to try your script on a zPDT or a heavily loaded system where the response to requests may not be immediate. The udev settle command isn't a reliable indicator that the device is available.

I'm looking for a reliable method that works all the time.
Re: Synchronous option for chccwdev -- was there a resolution?
On Thu, 26 Jul 2012, David Boyes wrote:
> A week or two back, someone (I think it was Florian Bilek) asked why there
> was a delay between invoking chccwdev and the device becoming available, and
> whether there was an option or command that would exit only when the device
> was actually available. There was some discussion of the --settle option in
> udev, but I don't recall seeing a resolution other than "loop on (test for
> device availability;sleep a few seconds) repeat".
>
> Was there a better solution? If not, could the IBM developers add a --sync
> option to chccwdev that forces chccwdev to wait until the requested operation
> is actually completed before exiting?

On which distro do you have problems with chccwdev? I just did a quick test:

    for i in {1..100} ;do vmcp def t3390 as 1234 100 ; chccwdev -e 1234 ; \
    udevadm settle; dasdfmt -b 4096 -y /dev/dasde ; chccwdev -d 1234 ;\
    vmcp det 1234 ;done

..which appears to work fine.

Regards,
Sebastian
Re: Synchronous option for chccwdev -- was there a resolution?
Is there a known reliable workaround? I have tried running dasdfmt in a script right after chccwdev and run into the non-existent device problem. So I tried putting a loop after chccwdev waiting for the device to appear in /dev, like this:

    while [[ ! -b $dev ]] ; do
        sleep 0.1
    done

and then ran dasdfmt. That works most of the time, but still isn't reliable enough. Especially on slow systems, I get the occasional

    DASD format failed: dasdfmt: (format cylinder) IOCTL BIODASDFMT failed. (Input/output error)

I know it is possible to rerun the dasdfmt, or whatever command follows the chccwdev, but it would be much nicer to have some indicator that the device is really ready. Then I could just write a wrapper around chccwdev and forget about this problem.

Thanks,
Tomas

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Florian Bilek
Sent: Friday, July 27, 2012 6:28 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Synchronous option for chccwdev -- was there a resolution?

Hi David,

Thank you for bringing up this topic again. No, unfortunately there was no other solution than to rerun the commands. I think there should be an option for chccwdev to wait till DE/CE is received and not to terminate with device busy.

Kind regards,
Florian

On Thu, Jul 26, 2012 at 6:52 PM, David Boyes wrote:
> A week or two back, someone (I think it was Florian Bilek) asked why
> there was a delay between invoking chccwdev and the device becoming
> available, and whether there was an option or command that would exit
> only when the device was actually available. There was some discussion
> of the --settle option in udev, but I don't recall seeing a resolution
> other than "loop on (test for device availability;sleep a few seconds)
> repeat".
>
> Was there a better solution? If not, could the IBM developers add a
> --sync option to chccwdev that forces chccwdev to wait until the
> requested operation is actually completed before exiting?
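The polling loop in this message has no upper bound, so a device that never appears would hang the script. A bounded variant, as a sketch only: `wait_for_blockdev` and its 30-second default are illustrative. As reported above, the node existing does not by itself prove the device is ready for I/O, so this narrows the window rather than closing it.

```shell
# wait_for_blockdev PATH [TIMEOUT_SECONDS]
# Polls once per second until PATH is a block device or the timeout
# expires. Returns 0 if the node appeared, 1 on timeout.
wait_for_blockdev() {
    local dev=$1 timeout=${2:-30} waited=0
    while [ ! -b "$dev" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A caller could use `wait_for_blockdev "$dev" 60 || exit 1` before dasdfmt, which at least turns a missing device into an explicit, logged failure.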
Re: Synchronous option for chccwdev -- was there a resolution?
Hi David,

Thank you for bringing up this topic again. No, unfortunately there was no other solution than to rerun the commands. I think there should be an option for chccwdev to wait till DE/CE is received and not to terminate with device busy.

Kind regards,
Florian

On Thu, Jul 26, 2012 at 6:52 PM, David Boyes wrote:
> A week or two back, someone (I think it was Florian Bilek) asked why there
> was a delay between invoking chccwdev and the device becoming available,
> and whether there was an option or command that would exit only when the
> device was actually available. There was some discussion of the --settle
> option in udev, but I don't recall seeing a resolution other than "loop on
> (test for device availability;sleep a few seconds) repeat".
>
> Was there a better solution? If not, could the IBM developers add a --sync
> option to chccwdev that forces chccwdev to wait until the requested
> operation is actually completed before exiting?
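The rerun approach mentioned here can at least be made bounded and explicit. A sketch under stated assumptions: `retry` is a hypothetical helper, and the attempt count and delay are placeholders to tune per system.

```shell
# retry ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Reruns CMD until it succeeds or ATTEMPTS runs have failed.
# Returns 0 on the first success, 1 once the attempts are exhausted.
retry() {
    local attempts=$1 delay=$2 n=1
    shift 2
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

For example, `retry 5 2 dasdfmt -b 4096 -y /dev/dasde` would rerun the format command from the earlier test up to five times, two seconds apart. This is a workaround for the symptom, not the ready indicator the thread is asking for.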