Re: Synchronous option for chccwdev -- was there a resolution?

2012-08-02 Thread Sebastian Ott
On Tue, 31 Jul 2012, David Boyes wrote:
> Florian's original post.
> Corroborating posts from other users (Mark Post, etc)
> My data (on average 3 out of 100 tests fail)
>
> I'd be happy to send you more examples. Are you looking for something 
> specific? The script recently posted here (by you, I think) can generate as 
> much failure data as you like.
>
> To be clear, dasdfmt doesn't complain about other users, it fails because 
> there's no device for it to operate on (yet). Inserting a wait of a few 
> (variable between 1 and 30 seconds, depending on load) seconds reduces, but 
> does not eliminate, the failures. Introducing a 60-90 second wait produces a 
> fairly reliable operation, but still not 100%. Given the need for a reliable 
> test for use in automation and/or the number of devices that commonly need to 
> be processed to create large LVM collections, a minute and a half wait just 
> because we can't reliably depend on chccwdev to be atomic isn't acceptable.

Hm, ok. I think we are dealing with 3 different types of failures here:

* missing error handling in scripts

* failing to ensure exclusive usage
  * most of the tools needed to activate a device require exclusive usage
    of the device
  * most of the tools needed to activate a device trigger additional
    uevents which would lead udev to check this device out

so instead of:
chccwdev -e
dasdfmt
fdasd
mkswap
chccwdev -d

you need to do:
chccwdev -e
udevadm settle
dasdfmt
udevadm settle
fdasd
udevadm settle
mkswap
udevadm settle
chccwdev -d

And using the --exit-if-exists option is not enough here - you really need
udev to finish using the device.
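
The second sequence above, combined with the first failure class (missing
error handling), could be wrapped roughly as follows. This is only a sketch:
run_step is a hypothetical helper name, and the arguments for dasdfmt and
fdasd are left out on purpose because the thread does not give them.

```bash
#!/bin/bash
# Sketch: run one command, then let udev drain its event queue before the
# next command touches the device. Returns non-zero on the first failure.
run_step() {
  "$@" || { echo "step failed: $*" >&2; return 1; }
  udevadm settle || { echo "udevadm settle failed after: $*" >&2; return 1; }
}

# Usage, with the device numbers from the test scripts in this thread
# (dasdfmt and fdasd would slot in the same way, with their own arguments):
#   run_step chccwdev -e 0.0.0302 &&
#   run_step mkswap /dev/disk/by-path/ccw-0.0.0302-part1 &&
#   run_step chccwdev -d 0.0.0302
```

Chaining the calls with && means a failed step is reported once and the
remaining steps are skipped, rather than running against a device that is
not ready.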

* cases where udev settle is not enough
  * after udev settle no device node is created
  * after udev settle udev is still using the device

Since this thread is about the last class of failures, I ran a _lot_ of
tests over the last couple of days under various system loads to trigger
this specific error. I could not find a single case where udev settle did
not do its job.
However, I found 2 possibly related bugs: one in CIO where a device is left
in an unusable state and one in DASD which could lead to udev using the
device after settle returns (but I could not trigger this one).

Once I'm done fixing these bugs I'll look into the distros to find out
if the fixes are applicable there and to look for other bugs lurking
there.

So I suspect that most of the things you observed are results of the 2nd
error class (but again I've not looked into the distros yet, maybe the
situation is different there).

Regards,
Sebastian

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/


Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-31 Thread Michael MacIsaac
Hello list,

Florian wrote about a .1 second sleep time - what a concept!  I've never
thought to sleep for less than a second.

David wrote:
> ... Inserting a wait of a few (variable between 1 and 30 seconds,
> depending on load) seconds reduces, but does not eliminate, the
> failures. Introducing a 60-90 second wait produces a fairly
> reliable operation, but still not 100%. ...

These two ideas got me to thinking - could we get to an enableDevice()
function that is both reliable and fast until/if "chccwdev -e" gets fixed?

I reworked that test code and was able to get a few failures on one
system, but only needed a millisecond of sleep, if my assumptions are
correct (I forget that a millisecond is a long time for a computer).

Could someone copy and paste this script and test on a system that fails
more regularly?  Thanks.


Here's the code:

# cat testudev  # snip below here
#!/bin/bash
function enableDevice()
{
  # Try to set the device online; on failure, retry with increasing delays.
  chccwdev -e "$1" > /dev/null 2>&1
  local rc=$?
  if [ "$rc" -ne 0 ]; then # chccwdev failed => try again
    for seconds in .001 .01 .04 .10 .14 .24 .38 .62 1 2 3 5 8 13 21 34; do
      echo "chccwdev -e failed; sleeping $seconds seconds"
      sleep "$seconds"
      chccwdev -e "$1" > /dev/null 2>&1
      rc=$?
      if [ "$rc" -eq 0 ]; then # success
        break # out of for loop
      fi
    done
  fi
  $udevCmd # wait for udev to finish processing the device
  return $rc
}

udevCmd="udevadm settle"
# udevCmd="udevsettle"
vmcp define vfb-512 302 2000 > /dev/null
enableDevice 0.0.0302
rc=$?
if [ $rc != 0 ]; then
  echo "return code from enableDevice 0.0.0302 = $rc"
fi
mkswap /dev/disk/by-path/ccw-0.0.0302-part1 > /dev/null 2>&1
rc=$?
if [ $rc != 0 ]; then
  echo "mkswap failed"
else
  echo "mkswap succeeded"
fi
chccwdev -d 0.0.0302 > /dev/null
rc=$?
if [ $rc != 0 ]; then
  echo "return code from chccwdev -d 0.0.0302 = $rc"
fi
vmcp det 302 > /dev/null
# snip above here


Here's the test run:

# cat /etc/*release
SUSE Linux Enterprise Server 11 (s390x)
VERSION = 11
PATCHLEVEL = 1
LSB_VERSION="core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-s390x:core-3.2-s390x:core-4.0-s390x"
# for i in {1..44}; do ./testudev | grep failed; done
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds
chccwdev -e failed; sleeping .001 seconds


"Mike MacIsaac" 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-31 Thread David Boyes
> On Sun, 29 Jul 2012, David Boyes wrote:
> > Since we have ample data from multiple sources that this DOES NOT
> > operate reliably, the original question still stands.
>
> Would you mind sharing some of this ample data? Are these all cases where
> dasdfmt complains about other users after "udevadm settle" returned?

Florian's original post.
Corroborating posts from other users (Mark Post, etc)
My data (on average 3 out of 100 tests fail)

I'd be happy to send you more examples. Are you looking for something specific? 
The script recently posted here (by you, I think) can generate as much failure 
data as you like. 

To be clear, dasdfmt doesn't complain about other users, it fails because 
there's no device for it to operate on (yet). Inserting a wait of a few 
(variable between 1 and 30 seconds, depending on load) seconds reduces, but 
does not eliminate, the failures. Introducing a 60-90 second wait produces a 
fairly reliable operation, but still not 100%. Given the need for a reliable 
test for use in automation and/or the number of devices that commonly need to 
be processed to create large LVM collections, a minute and a half wait just 
because we can't reliably depend on chccwdev to be atomic isn't acceptable. 

I would think it to be a reasonable expectation that 'chccwdev' would not exit 
until the operation requested was tested to be actually complete and ready to 
use, or at least provide an option to request that behavior.  

> When this fails do you find messages from udev in /var/log/messages?

Only if udev debugging is turned on (at least in my case -- can't really speak 
for others). I will send you an example offlist. 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-31 Thread Sebastian Ott
On Sun, 29 Jul 2012, David Boyes wrote:
> Since we have ample data from multiple sources that this DOES NOT operate 
> reliably, the original question still stands.

Would you mind sharing some of this ample data? Are these all cases where
dasdfmt complains about other users after "udevadm settle" returned?
When this fails do you find messages from udev in /var/log/messages?

Regards,
Sebastian



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-29 Thread David Boyes
> On Sunday, 07/29/2012 at 05:14 EDT, David Boyes wrote:
> > In any case, I think Mike's question is a good one: if chccwdev needs
> > a 'udevadm settle' to operate correctly, why isn't it doing it itself?
> > It seems like we should be able to rely on chccwdev operations being
> > atomic.
> 
> This seems like the wrong question.  Rather, why is the creation of
> /dev/x  running asynchronously?  I can appreciate that there are
> asynchronous *user space* things that want to kick off when a device is
> added, but creation of the /dev entry?

Well, if we punt to udev to manage the device creation, that's the nature of 
udev as designed -- to be async (to avoid some messy kernel-space stuff that 
tended to hang the machine if things didn't go right). I doubt that any of us 
will significantly influence the direction/design of udev. 

chccwdev is completely under IBM control, though -- it'll be a lot easier to
change that than udev.

> Is this an issue on other platforms,
> too?

Probably, although I would bet no other platform does as many "hardware" device 
move/add/changes as 390x does, so it hasn't really been noticed. 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-29 Thread Alan Altmark
On Sunday, 07/29/2012 at 05:14 EDT, David Boyes wrote:
> In any case, I think Mike's question is a good one: if chccwdev needs a
> 'udevadm settle' to operate correctly, why isn't it doing it itself? It
> seems like we should be able to rely on chccwdev operations being atomic.

This seems like the wrong question.  Rather, why is the creation of
/dev/x  running asynchronously?  I can appreciate that there are
asynchronous *user space* things that want to kick off when a device is
added, but creation of the /dev entry?

I mean, how is fstab able to do its job if the devices referenced therein
might, in theory, not be defined, yet?  Is this an issue on other
platforms, too?

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile; 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-29 Thread David Boyes
> Since dasdfmt does the low-level formatting stuff it tries to make sure
> it's the only user of the device. But in your case it looks like sometimes
> it's not the only user and it's likely that's because some worker of udev
> is not finished and still has a file descriptor to this device node opened.
> 
> So I still think it is sufficient to do:
> chccwdev -e xxx ;udevadm settle ;dasdfmt xxx

Since we have ample data from multiple sources that this DOES NOT operate
reliably, the original question still stands.
How can we reliably block until an I/O subsystem operation is fully and reliably
known to be complete?

Tracing the udevadm process seems to show some /by-uuid processing that is
failing due to UUIDs being the same -- is there something in the udev device
activation that somehow relies on a unique UUID? If so, that would explain why
it sometimes works and sometimes doesn't (if the device you're trying to
activate is on a different physical disk, you'd win here; if it's on the same
physical disk, you'd lose, or at least have udev try alternative code paths to
get a unique device node created and assigned). Using TDISK or VDISK for the
test case would mislead, as at least TDISK could be on different physical
volumes.  I can send you a log from the trace if that would help you.

In any case, I think Mike's question is a good one: if chccwdev needs a 
'udevadm settle' to operate correctly, why isn't it doing it itself? It seems 
like we should be able to rely on chccwdev operations being atomic. 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-28 Thread Sebastian Ott
On Sat, 28 Jul 2012, Michael MacIsaac wrote:
>   function enableDevice { chccwdev -e $1; udevadm settle; }
>
> and always call that function instead of chccwdev -e. So my question is
> still: "If a udevadm settle is always required after a chccwdev -e, then
> why is it not just built into the command?"
Because a) it depends on the type of the device and b) we would have a
dependency on udevadm.

Regards,
Sebastian



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-28 Thread Michael MacIsaac
Sebastian,

> So I still think it is sufficient to do:
> chccwdev -e xxx ;udevadm settle ;dasdfmt xxx

... which is somewhat the conclusion I came to with the previous test
script. So everyone wanting to script with chccwdev -e could write a
function such as:

  function enableDevice { chccwdev -e $1; udevadm settle; }

and always call that function instead of chccwdev -e. So my question is
still: "If a udevadm settle is always required after a chccwdev -e, then
why is it not just built into the command?"

"Mike MacIsaac" 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-28 Thread Sebastian Ott
On Fri, 27 Jul 2012, Florian Bilek wrote:
> I can confirm that udev settle always returns zero. I have even set the
> timeout to 60 sec, with an exit if the device node is available. And that
> is still not enough, because udev exits. Without the exit I had set the
> timeout to 30 secs. Only when running the loop two times does the chance
> of success get high.
>
> As I wrote in my first mail, I already encountered this problem a long
> time ago. I always found workarounds, but with every kernel update there
> is a chance that the race condition comes back.
>
> In any case, it seems that there isn't a reliable check that the device is
> really usable. It is always a gamble whether your procedure succeeds or not.

I found the other mail thread mentioned here and have an assumption of
what went wrong. I blame the --exit-if-exists option of udev settle (which
should be ok in most cases but is not if you want to use dasdfmt
afterwards). For the sake of argument let's assume that using udev settle
like that would be the same as:
if [ ! -e /dev/dasdx ] ;then
  udevadm settle
fi
So sometimes you just wait for udev calling mknod but you don't wait for
udev finishing the other stuff it does with this device.

Since dasdfmt does the low-level formatting stuff it tries to make sure
it's the only user of the device. But in your case it looks like
sometimes it's not the only user and it's likely that's because some
worker of udev is not finished and still has a file descriptor to
this device node opened.

So I still think it is sufficient to do:
chccwdev -e xxx ;udevadm settle ;dasdfmt xxx
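
To illustrate the difference from --exit-if-exists, here is a minimal sketch
that waits for the whole udev queue first and only then checks for the node.
settle_then_check and its return codes are hypothetical, chosen for this
example; it does not prove that no udev worker still holds the device open,
it only avoids returning as soon as mknod has happened.

```bash
#!/bin/bash
# Sketch: wait for all pending udev events, then confirm the node exists.
# Return codes (an assumption of this sketch): 0 = node present,
# 1 = node missing after settle, 2 = udevadm settle itself failed.
settle_then_check() {
  local node="$1"
  udevadm settle || return 2   # no --exit-if-exists: wait for all events
  [ -e "$node" ] || return 1
  return 0
}

# Usage (device node name is a placeholder, as in the mail above):
#   chccwdev -e xxx && settle_then_check /dev/dasdx && dasdfmt xxx
```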

Regards,
Sebastian



Re: Synchronous option for chccwdev -- was there a resolution ?

2012-07-28 Thread Ben Duncan
Point Taken, BUT, I might add, this worked on a Z9 where we have about
40 instances on a single LPAR, and one of those was a JAVA resource hog
from our state tax commission. Still only have about 200 more to set up.

Ben Duncan - Business Network Solutions, Inc. 336 Elton Road  Jackson
MS, 39212
"Never attribute to malice, that which can be adequately explained by
stupidity"
- Hanlon's Razor


> -------- Original Message --------
> Subject: Re: Synchronous option for chccwdev -- was there a resolution ?
> From: David Boyes
> Date: Fri, July 27, 2012 11:17 am
> To: LINUX-390@VM.MARIST.EDU
>
> > We have a set of scripts that Setup the WHOLE multipath SAN disk for us.
> > From the chccwdev, zfcp_*, to fdisk
> > and format and multipath setup and mount (Yes the WHOLE thing).  We
> > have found by placing sleep for 15 seconds between commands , especially
> > the hardware level ones, increased our reliability for success.
>
> Yeah, but the timing for "reliably" doing that appears to depend a lot on
> what else is going on -- could be 1 second or 5 or 15 -- and it's inefficient
> to sit there and wait or poll if you don't have to. That's why I'm looking
> for a reliable "yes, you can proceed, it's ready to go" indicator.
>
> Leaving the default as the current somewhat async behavior is OK, but I'd
> really like a "block until you're absolutely sure the action is complete and
> functioning" option.



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Mark Post
>>> On 7/27/2012 at 01:26 PM, Michael MacIsaac  wrote: 
> Nice test case! I modified it a bit :))

I have two SLES10 SP4 systems.  One is on a fairly loaded box, and one is on a
fairly idle box.  On the loaded box, the failure rate of the chccwdev -e
command was fairly high, even with a 2 second sleep between the vmcp define
command and the chccwdev.  On the fairly idle box, I never saw a chccwdev -e
failure (but I did get one chccwdev -d failure).  In both cases, I iterated
over the script 1000 times, with the following results:
Idle SLES10 SP4:
492 cases with udevsettle - 0 failures = 100% successes
507 cases without udevsettle - 506 failures = 99.8% failures

Busy SLES10 SP4:
154 chccwdev -e failures
1 chccwdev -d failure = 15.5% total failures.  Note that the chccwdev -d 
command _always_ follows a udevsettle command.
413 cases with udevsettle - 0 failures = 100% successes
433 cases without udevsettle - 423 failures = 97.7% failures

In one case, the chccwdev -e failure was temporary.  In all other 153 cases, 
the entry in /sys/bus/ccw/devices/ was not created, even after 3 seconds of 
waiting.  The message from chccwdev -e was "0.0.0302 is not a channel device."  
That's while running the script in a loop.
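
A script can at least detect this case before calling chccwdev -e by polling
for the sysfs entry with a bounded wait. The following is a sketch only:
wait_for_ccw_entry is a hypothetical helper, the 0.1 second polling interval
and the default of 30 tries (about 3 seconds, matching the wait mentioned
above) are arbitrary choices, and CCW_SYSFS_ROOT is an invented override that
exists purely so the function can be tried outside a real s390 guest.

```bash
#!/bin/bash
# Sketch: poll until /sys/bus/ccw/devices/<busid> appears, or give up.
# $1 = bus ID (e.g. 0.0.0302), $2 = number of tries (default 30).
wait_for_ccw_entry() {
  local busid="$1" tries="${2:-30}"
  local root="${CCW_SYSFS_ROOT:-/sys/bus/ccw/devices}"
  while [ "$tries" -gt 0 ]; do
    [ -e "$root/$busid" ] && return 0
    tries=$((tries - 1))
    sleep 0.1
  done
  return 1   # entry never showed up within the allowed tries
}

# Usage:
#   vmcp define vfb-512 302 2000
#   wait_for_ccw_entry 0.0.0302 || echo "0.0.0302 not in sysfs" >&2
```

This does not fix the case where the entry is never created; it only turns
it into an explicit error instead of a confusing chccwdev message.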

If I run the script manually each time, I tend to see a couple of different
failures, although much less frequently.  The most common is that the entry in
/sys/bus/ccw/devices/ does show up in a few seconds (3 or less).  Presumably,
this case is taken care of by udevsettle.  The next most common (which is far
less common than the first failure mode) is that the entry in
/sys/bus/ccw/devices/ is never created, nor is the entry in /sys/devices/css0/.
In some rare cases, when the chccwdev -d command is issued, the entry in
/sys/devices/css0/0.0./ is removed, but the /sys/bus/ccw/devices/0.0.0302/
is not, leading to a broken symbolic link.  If I redefine the device, I can use
it again, but disabling it and detaching it leaves the dangling symlink.  The
only thing that seems to clear that up is a reboot.  I don't know if leaving it
alone would cause any problems further on down the road or not.
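
For anyone wanting to check whether a guest has accumulated such broken
links, something along these lines should list them. It is a sketch:
find_dangling_ccw_links is a hypothetical name, and the optional directory
argument is only there so the function can be exercised on a scratch
directory instead of the real sysfs tree.

```bash
#!/bin/bash
# Sketch: print every symlink in the given directory whose target is gone.
# $1 = directory to scan (default: /sys/bus/ccw/devices).
find_dangling_ccw_links() {
  local root="${1:-/sys/bus/ccw/devices}" link
  for link in "$root"/*; do
    # -L: it is a symlink; ! -e: following it does not reach anything
    if [ -L "$link" ] && [ ! -e "$link" ]; then
      echo "$link"
    fi
  done
}

# Usage:
#   find_dangling_ccw_links
```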

Idle SLES11 SP2
502 cases with udevadm settle - 0 failures, 100% successes
498 cases without udevadm settle 
23 chccwdev -e failures = 11.5%
97 udevsettle cases = 100% success
80 no udevsettle = 100% failure.


Given the difference in results between a busy and idle SLES10 SP4 system, and 
the fact that I don't have a SLES11 SP2 guest on a busy system,  David's rate 
of about 3% failures with udevadm settle can't be ignored.

I doubt very much it's the udevadm settle timeout value.  The default is 180 
seconds for _everything_.
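
To rule the timeout in or out, a script can pass it explicitly and look at
the exit status rather than discarding it. A minimal sketch, assuming the
udevadm settle --timeout option and assuming that a non-zero exit status
means the queue had not drained in time; settle_or_report is a hypothetical
wrapper name, and on SLES10 the command would be udevsettle instead.

```bash
#!/bin/bash
# Sketch: settle with an explicit timeout and report if it did not finish.
# $1 = timeout in seconds (default 180, the figure quoted above).
settle_or_report() {
  local timeout="${1:-180}"
  if ! udevadm settle --timeout="$timeout"; then
    echo "udev queue still busy after ${timeout}s" >&2
    return 1
  fi
}

# Usage:
#   chccwdev -e 0.0.0302 && settle_or_report 60
```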

I think at this point, I need to state the obvious: if a customer (or business
partner) experiencing this problem has a support contract with SUSE, Red Hat,
or IBM and opens up a support request, more effort is likely to go into
figuring out a fix.


Mark Post



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Florian Bilek
Dear all,

I can confirm that udev settle always returns zero. I have even set the
timeout to 60 sec, with an exit if the device node is available. And that
is still not enough, because udev exits. Without the exit I had set the
timeout to 30 secs. Only when running the loop two times does the chance
of success get high.

As I wrote in my first mail, I already encountered this problem a long
time ago. I always found workarounds, but with every kernel update there
is a chance that the race condition comes back.

In any case, it seems that there isn't a reliable check that the device is
really usable. It is always a gamble whether your procedure succeeds or not.

Since my work on the clone procedure I see that the critical path is the
number of steps necessary to make one device usable:

1. attach it (vmcp)
2. vary it online (chccwdev -e)
3. format it (dasdfmt)
4. partition it (fdasd)
5. make the file system

The most critical steps are 2 and 3. I see in my exec that most of the time
both steps fail and need to be rerun. This is happening with the current
kernel version on SLES 11 SP2. With SP1, usually only one of the two
steps failed.
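
Rerunning a failed step can be made explicit with a bounded retry helper.
This is a generic sketch, not the exec referred to above: retry is a
hypothetical function, and the attempt count and delay are left to the
caller. It is a workaround for the race, not a fix.

```bash
#!/bin/bash
# Sketch: run a command up to N times, sleeping between attempts.
# $1 = maximum attempts, $2 = delay in seconds, rest = command to run.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  while true; do
    "$@" && return 0                       # command succeeded
    [ "$n" -ge "$attempts" ] && return 1   # out of attempts
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage:
#   retry 3 1 chccwdev -e 0.0.0302 && udevadm settle
```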

On SLES 10 the problem was that the partition node didn't show up, or after
a certain number of (successful) chccwdevs the kernel could not bring the
device online any more and a reboot of the guest was required.

sync or other tricks I know do not really solve the problem, and udev is
also disappointing since it reports that everything is safe, which is not
the case. I used dasd_config from SLES 11 but it didn't solve the problem
either.

The chances are high that the situation arises between any two of these
steps. Having a fixed 30 second delay between all the steps makes the process,
including formatting and creation of the filesystem, quite long. Personally
I would say unacceptably long.

We use here an IBM z/10 with DS 8700 for the disks. z/VM is 6.2 on the latest
RSU. So it is original and fast equipment, and the situation still appears
there.

Kind regards,
Florian





On Fri, Jul 27, 2012 at 8:23 PM, David Boyes  wrote:

> > I believe that's the piece that's missing (for most people).  I can
> > easily reproduce the problem on my SLES11 SP2 system with this script:
> > vmcp define vfb-512 302 2000
> > date +%H:%M:%S.%N
> > chccwdev -e 0.0.0302
> > mkswap /dev/disk/by-path/ccw-0.0.0302-part1
>
> Yeah, that's pretty much guaranteed to fail. If you insert a 'udevadm
> settle' after the 'chccwdev -e', I still get a failure about 3 times out of
>  100 attempts, though.
>
> Alan may be on to something with the timeout value for udev for that type
> of device.
>



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread David Boyes
> I believe that's the piece that's missing (for most people).  I can easily
> reproduce the problem on my SLES11 SP2 system with this script:
> vmcp define vfb-512 302 2000
> date +%H:%M:%S.%N
> chccwdev -e 0.0.0302
> mkswap /dev/disk/by-path/ccw-0.0.0302-part1

Yeah, that's pretty much guaranteed to fail. If you insert a 'udevadm settle' 
after the 'chccwdev -e', I still get a failure about 3 times out of  100 
attempts, though. 

Alan may be on to something with the timeout value for udev for that type of 
device. 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Mark Post
>>> On 7/27/2012 at 01:26 PM, Michael MacIsaac  wrote: 
> I never got the chccwdev to fail.

If you did the test on SLES11 or later, I don't think that command will fail 
since it uses /proc/cio_settle.


Mark Post



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Michael MacIsaac
Mark,

> I can easily reproduce the problem on my SLES11 SP2 system with this
> script:
> ...

Nice test case! I modified it a bit :))  I never got the chccwdev to fail.
 I did see the mkswap fail regularly.  Then I randomly added a "udevadm
settle" after the chccwdev -e. Every time the udevadm settle kicks in, the
mkswap works! So maybe the solution is as easy as adding a udevadm settle
after the chccwdev -e?

Here's the test run:

# for i in {1..10}
> do
>   testudev
> done
seed is small - add udevadm settle
SUCCESS
seed is large
FAILURE
seed is large
FAILURE
seed is small - add udevadm settle
SUCCESS
seed is small - add udevadm settle
SUCCESS
seed is large
FAILURE
seed is large
FAILURE
seed is large
FAILURE
seed is large
FAILURE
seed is large
FAILURE

Here's the code:

# cat testudev
#!/bin/bash
let seed=$RANDOM
if [ $seed -lt 16384 ]; then # add a udevsettle
  echo "seed is small - add udevadm settle"
else
  echo "seed is large"
fi
vmcp define vfb-512 302 2000 > /dev/null
chccwdev -e 0.0.0302 > /dev/null
rc=$?
if [ $rc != 0 ]; then
  echo "return code from chccwdev -e 0.0.0302 = $rc"
fi
if [ $seed -lt 16384 ]; then # add a udevsettle
  udevadm settle
fi
mkswap /dev/disk/by-path/ccw-0.0.0302-part1 > /dev/null 2>&1
rc=$?
if [ $rc != 0 ]; then
  echo "FAILURE"
else
  echo "SUCCESS"
fi
chccwdev -d 0.0.0302 > /dev/null
rc=$?
if [ $rc != 0 ]; then
  echo "return code from chccwdev -d 0.0.0302 = $rc"
fi
vmcp det 302 > /dev/null

"Mike MacIsaac"



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Alan Altmark
On Friday, 07/27/2012 at 11:33 EDT, Sebastian Ott wrote:
> The process of setting the device online involves generic path
> verification work done by the Common IO Layer and device specific
> online processing done by the device driver (DASD in this case). Once
> the DASD driver finished its work and created a block device, userspace
> is informed about this via uevents. After that chccwdev returns. The
> only thing that's missing now is udev creating a device node and that's
> covered via udev settle.

Is it possible that the udev settle timeout is set to zero, preventing it
from waiting?  (I see that you can set up a udev debug log to see what
udev is doing.)
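
One way to rule out a too-short settle timeout is to pass the timeout explicitly and look at the exit status. A minimal sketch, assuming a udevadm whose settle subcommand accepts --timeout; the function name settle_or_report and the 30-second default are arbitrary choices for illustration.

```shell
#!/bin/bash
# Hypothetical helper: run "udevadm settle" with an explicit timeout and
# report when the event queue did not drain in time.
settle_or_report() {
  local timeout_s="${1:-30}"
  if udevadm settle --timeout="$timeout_s"; then
    echo "udev queue empty"
  else
    # A non-zero status is treated here as "events still pending".
    echo "udev still busy after ${timeout_s}s" >&2
    return 1
  fi
}
```

If the debug log is wanted as well, udev releases of this era accept "udevadm control --log-priority=debug"; the exact option name may differ on other versions.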

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: Synchronous option for chccwdev -- was there a resolution ?

2012-07-27 Thread David Boyes
> We have a set of scripts that Setup the WHOLE multipath SAN disk for us.
> From the chccwdev, zfcp_*, to fdisk
> and format and multipath setup and mount (Yes the WHOLE thing).  We
> have found by placing sleep for 15 seconds between commands , especially
> the hardware level ones, increased our reliability for success.

Yeah, but the timing for "reliably" doing that appears to depend a lot on what 
else is going on -- could be 1 second or 5 or 15 -- and it's inefficient to sit 
there and wait or poll if you don't have to. That's why I'm looking for a 
reliable "yes, you can proceed, it's ready to go" indicator. 

Leaving the default as the current somewhat async behavior is OK, but I'd 
really like a "block until you're absolutely sure the action is complete and 
functioning" option.  
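
Until such an option exists, the closest approximation is a bounded poll rather than a fixed sleep. The following is a sketch under stated assumptions: wait_until and dasd_ready are hypothetical names, and the readiness check (the ccw "online" attribute in sysfs plus the by-path block node) is only one plausible definition of "ready".

```shell
#!/bin/bash
# Hypothetical helper: poll a readiness check until it succeeds or a
# deadline passes. Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout_s="$1"; shift
  local deadline=$(( SECONDS + timeout_s ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1            # gave up: the check never succeeded
    fi
    sleep 0.2
  done
  return 0
}

# Example readiness check for a DASD (an assumption, not a guarantee):
# the ccw device reports online and the by-path block node exists.
dasd_ready() {
  local busid="$1"
  [ "$(cat "/sys/bus/ccw/devices/$busid/online" 2>/dev/null)" = "1" ] &&
    [ -b "/dev/disk/by-path/ccw-$busid" ]
}
```

A call such as "wait_until 60 dasd_ready 0.0.0302" returns as soon as the check passes instead of always waiting the full interval. It bounds the wait; it does not prove that the first I/O to the device will succeed.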


Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Mark Post
>>> On 7/27/2012 at 11:15 AM, Sebastian Ott  wrote: 
> Once
> the DASD driver finished its work and created a block device, userspace
> is informed about this via uevents. After that chccwdev returns. The
> only thing that's missing now is udev creating a device node and that's
> covered via udev settle.

I believe that's the piece that's missing (for most people).  I can easily 
reproduce the problem on my SLES11 SP2 system with this script:
vmcp define vfb-512 302 2000
date +%H:%M:%S.%N
chccwdev -e 0.0.0302
mkswap /dev/disk/by-path/ccw-0.0.0302-part1
date +%H:%M:%S.%N
udevadm settle
chccwdev -d 0.0.0302
vmcp det 302

It fails almost every time. (And if I leave the udevadm settle command out 
before the chccwdev -d command, that will usually fail also.)  If I add a 
udevadm settle just after the chccwdev -e, it works.  Since my system is not 
heavily loaded, I can't be sure that it will work 100% of the time, but it 
certainly does a better job than without it.  For my SLES10 system, I had to 
use the udevsettle command, of course.

Our dasd_configure script uses udevsettle/udevadm settle for bringing volumes 
on and offline, and it seems to work fine as well.
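
The ordering described above, with one settle before the device is used and another before it is taken offline, could be wrapped up like this. A sketch only: the function name with_dasd is made up here and is not part of dasd_configure or s390-tools.

```shell
#!/bin/bash
# Hypothetical wrapper: bring a ccw device online, let udev finish, run a
# command against it, let udev finish again, then take the device offline.
# Usage: with_dasd BUSID COMMAND [ARGS...]
with_dasd() {
  local busid="$1"; shift
  local rc=0
  chccwdev -e "$busid" || return 1
  udevadm settle || { chccwdev -d "$busid"; return 1; }
  "$@" || rc=$?                      # remember the command's status
  udevadm settle                     # udev may still hold the device open
  chccwdev -d "$busid" || return 1
  return "$rc"
}
```

For example, "with_dasd 0.0.0302 mkswap /dev/disk/by-path/ccw-0.0.0302-part1" runs the mkswap from the script above between the two settle calls and returns mkswap's exit status.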


Mark Post



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread David Boyes
>Once the DASD driver finished
> its work and created a block device, userspace is informed about this via
> uevents. After that chccwdev returns. The only thing that's missing now is
> udev creating a device node and that's covered via udev settle.

Thanks for the walkthrough. The problem appears to be somewhere after the exit 
from udev settle -- the device is not always actually ready and available for 
use when udev settle exits. 
We're seeing the problem on RHEL 6 guests on varying hardware (zPDT, older 
Sharks, etc) -- basically slower hardware that can't necessarily respond 
instantly to requests. I don't know what Florian has, but that seems to be 
characteristic of what we're seeing when it fails. 

>So unless I'm missing something the described method
> should work reliable on current distros. (In a real world scenario you have to
> check return codes of the previous steps before executing the next one.)

Yeah. We're getting rc=0 for the  chccwdev and udev settle, which is what's 
kinda weird about it. I'll see if we can convince udev to wait a bit longer on 
the settle. 



Re: Synchronous option for chccwdev -- was there a resolution ?

2012-07-27 Thread Ben Duncan
We have a set of scripts that set up the WHOLE multipath SAN disk for us,
from the chccwdev and zfcp_* steps through fdisk, format, multipath setup
and mount (yes, the WHOLE thing).  We have found that placing a sleep of
15 seconds between commands, especially the hardware-level ones,
increased our success rate.

Ben Duncan - Business Network Solutions, Inc. 336 Elton Road  Jackson
MS, 39212
"Never attribute to malice, that which can be adequately explained by
stupidity"
- Hanlon's Razor


> -------- Original Message --------
> Subject: Re: Synchronous option for chccwdev -- was there a resolution?
> From: David Boyes 
> Date: Fri, July 27, 2012 9:05 am
> To: LINUX-390@VM.MARIST.EDU
> 
> 
> > On which distro do you have problems with chccwdev?
> > [snip]
> > ..which appears to work fine.
> 
> It's not a distribution issue; it's a timing-dependent issue that has to do 
> with how quickly your hardware responds. Your script will work *most* of the 
> time -- except when it doesn't. 
> Easy way to reproduce the problem is to try your script on a zPDT or a 
> heavily loaded system where the response to requests may not be immediate. 
> The udev settle command isn't a reliable indicator that the device is 
> available. 
> 
> I'm looking for a reliable method that works all the time. 


Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Sebastian Ott
On Fri, 27 Jul 2012, David Boyes wrote:
> > On which distro do you have problems with chccwdev?
> > [snip]
> > ..which appears to work fine.
>
> It's not a distribution issue; it's a timing-dependent issue that has to do 
> with how quickly your hardware responds. Your script will work *most* of the 
> time -- except when it doesn't.
> Easy way to reproduce the problem is to try your script on a zPDT or a 
> heavily loaded system where the response to requests may not be immediate. 
> The udev settle command isn't a reliable indicator that the device is 
> available.
>
> I'm looking for a reliable method that works all the time.

OK. We have 3 userspace triggered actions here:

1) make the device available to linux, e.g. via vmcp define
2) set the device online via chccwdev
3) actually use the device via its device node

After 1) and before we can do 2), we have to make sure that Linux has
received a machine check and that the device recognition steps
done by the Common IO Layer (triggered by the machine check) are
finished. Both are achieved by cio_settle, which is invoked by chccwdev
(on current distros) before the online setting starts.

The process of setting the device online involves generic path
verification work done by the Common IO Layer and device specific
online processing done by the device driver (DASD in this case). Once
the DASD driver has finished its work and created a block device, userspace
is informed about this via uevents. After that chccwdev returns. The
only thing that's missing now is udev creating a device node and that's
covered via udev settle.

I'm certainly no expert on the DASD driver, but I can't see a race window
or timing issue here. So unless I'm missing something the described
method should work reliably on current distros. (In a real-world scenario
you have to check return codes of the previous steps before executing
the next one.)
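
The return-code checking could look roughly like this. A minimal sketch, assuming the vfb-512 test device used elsewhere in this thread; the helper names step and activate_dasd are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run one step and say which one failed.
step() {
  local label="$1"; shift
  "$@" || { echo "step failed: $label (rc=$?)" >&2; return 1; }
}

# Steps 1) and 2) from above, followed by the udev settle. Each step runs
# only if the previous one returned 0.
# Usage: activate_dasd VDEV BUSID   (e.g. activate_dasd 302 0.0.0302)
activate_dasd() {
  local vdev="$1" busid="$2"
  step "attach device" vmcp define vfb-512 "$vdev" 2000 &&
  step "set online"    chccwdev -e "$busid" &&
  step "wait for udev" udevadm settle
}
```

Step 3), the actual use of the device node, would follow only when activate_dasd returns 0.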

Regards,
Sebastian


Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread David Boyes
> On which distro do you have problems with chccwdev?
> [snip]
> ..which appears to work fine.

It's not a distribution issue; it's a timing-dependent issue that has to do 
with how quickly your hardware responds. Your script will work *most* of the 
time -- except when it doesn't. 
Easy way to reproduce the problem is to try your script on a zPDT or a heavily 
loaded system where the response to requests may not be immediate. The udev 
settle command isn't a reliable indicator that the device is available. 

I'm looking for a reliable method that works all the time. 



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-27 Thread Sebastian Ott
On Thu, 26 Jul 2012, David Boyes wrote:
> A week or two back, someone (I think it was Florian Bilek) asked why there 
> was a delay between invoking chccwdev and the device becoming available, and 
> whether there was an option or command that would exit only when the device 
> was actually available. There was some discussion of the --settle option in 
> udev, but I don't recall seeing a resolution other than "loop on (test for 
> device availability;sleep a few seconds) repeat".
>
> Was there a better solution? If not, could the IBM developers add a --sync 
> option to chccwdev that forces chccwdev to wait until the requested operation 
> is actually completed before exiting?

On which distro do you have problems with chccwdev?
I just did a quick test:
for i in {1..100} ;do vmcp def t3390 as 1234 100 ; chccwdev -e 1234 ; \
udevadm settle; dasdfmt -b 4096 -y /dev/dasde ; chccwdev -d 1234 ;\
vmcp det 1234 ;done

..which appears to work fine.
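
Because the failures reported in this thread are intermittent (a few per hundred runs), a test loop is more telling if it counts failures instead of relying on someone watching the output. A sketch, with count_failures as a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: run a command N times and report how often it failed.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
  local runs="$1"; shift
  local i failed=0
  for (( i = 1; i <= runs; i++ )); do
    "$@" > /dev/null 2>&1 || failed=$(( failed + 1 ))
  done
  echo "$failed/$runs failed"
  [ "$failed" -eq 0 ]               # non-zero status if any run failed
}
```

Putting one define/online/settle/format/offline/detach cycle into a shell function that returns non-zero on any error, and calling "count_failures 100" on that function, gives a number that can be compared across systems and load levels.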

Regards,
Sebastian



Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-26 Thread Pavelka, Tomas
Is there a known reliable workaround? I have tried running dasdfmt in a script 
right after chccwdev and run into the non-existent device problem. So I tried 
putting a loop after chccwdev waiting for the device to appear in /dev, like 
this:

while [[ ! -b $dev ]] ; do
  sleep 0.1
done

and then ran dasdfmt. That works most of the time, but it still isn't reliable 
enough: especially on slow systems I get the occasional 

"DASD format failed: dasdfmt: (format cylinder) IOCTL BIODASDFMT failed. 
(Input/output error)"

I know it is possible to rerun the dasdfmt, or whatever command follows the 
chccwdev, but it would be much nicer to have some indicator that the device is 
really ready. Then I could just write a wrapper around chccwdev and forget 
about this problem. 
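
For completeness, the "rerun the command" workaround can at least be bounded. This is a sketch of a generic retry helper, not a fix for the underlying race; the name retry and its argument order are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: rerun a command until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between attempts.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
  local attempts="$1" delay_s="$2"; shift 2
  local n
  for (( n = 1; n <= attempts; n++ )); do
    "$@" && return 0
    (( n < attempts )) && sleep "$delay_s"   # no sleep after the last try
  done
  return 1
}
```

Something like "retry 5 2 dasdfmt -b 4096 -y $dev" would then replace the single dasdfmt call. It only makes sense for commands that are safe to repeat.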

Thanks,

Tomas

-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Florian 
Bilek
Sent: Friday, July 27, 2012 6:28 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Synchronous option for chccwdev -- was there a resolution?

Hi David,

Thank you for bringing up this topic again. No, unfortunately there was no 
other solution than to rerun the commands.

I think there should be an option for chccwdev to wait till DE/CE is received 
and not to terminate with device busy.

Kind regards,
Florian

On Thu, Jul 26, 2012 at 6:52 PM, David Boyes  wrote:

> A week or two back, someone (I think it was Florian Bilek) asked why 
> there was a delay between invoking chccwdev and the device becoming 
> available, and whether there was an option or command that would exit 
> only when the device was actually available. There was some discussion 
> of the --settle option in udev, but I don't recall seeing a resolution 
> other than "loop on (test for device availability;sleep a few seconds) 
> repeat".
>
> Was there a better solution? If not, could the IBM developers add a 
> --sync option to chccwdev that forces chccwdev to wait until the 
> requested operation is actually completed before exiting?


Re: Synchronous option for chccwdev -- was there a resolution?

2012-07-26 Thread Florian Bilek
Hi David,

Thank you for bringing up this topic again. No, unfortunately there was no
other solution than to rerun the commands.

I think there should be an option for chccwdev to wait till DE/CE is
received and not to terminate with device busy.

Kind regards,
Florian

On Thu, Jul 26, 2012 at 6:52 PM, David Boyes  wrote:

> A week or two back, someone (I think it was Florian Bilek) asked why there
> was a delay between invoking chccwdev and the device becoming available,
> and whether there was an option or command that would exit only when the
> device was actually available. There was some discussion of the --settle
> option in udev, but I don't recall seeing a resolution other than "loop on
> (test for device availability;sleep a few seconds) repeat".
>
> Was there a better solution? If not, could the IBM developers add a --sync
> option to chccwdev that forces chccwdev to wait until the requested
> operation is actually completed before exiting?