Re: udev_retry

2011-09-15 Thread Bruce Dubbs
Bryan Kadzban wrote:

> Or sysconfig, or wherever similar scripts are put in Bruce's new setup.

Just to mention the layout, what I have is:

/lib/services   (network service scripts)
/lib/lsb        (symlink to /lib/services/, init-functions)

/etc/sysconfig  (All config files here, no subdirectories)
/etc/init.d     (symlink to /etc/rc.d/init.d/ for
                 compatibility/convenience)

/etc/rc.d/{init,0..6,S}.d (standard bootscript directories)

   -- Bruce


-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page


Re: udev_retry

2011-09-15 Thread Bryan Kadzban
Nathan Coulson wrote:
> Another thought (one I have not actually tested, forgive me if it's
> not possible) is to trigger only block devices in the first pass, then
> try devices/subsystems on the 2nd pass?

DJ Lucas wrote:
> Can we not simply re-trigger all known affected subsystems with a
> subsystem match?

Ooo, interesting.  I believe either of these should work fine.

(It would also be possible to make udev_retry blindly retry *all*
events, but that will make it take slightly longer to finish the
settle call, as well.  :-) )

> Now, if that would work well enough, then just add a configuration
> file for the udev_retry bootscript so that it can be extended in BLFS
> for say ALSA, and then parse the list.

Yeah.  It would also work to make the pre-checkfs/pre-mountfs udevadm
trigger call have a list of subsystems to trigger, of course, if we
triggered just the block subsystem by default (as Nathan suggested).

> for SUBSYSTEM in `grep -v '^#' /etc/udev_retry.conf`

Or sysconfig, or wherever similar scripts are put in Bruce's new setup.
But yeah.

> you could even write a message for each one if you wanted to have
> more verbose output in the event of a failure, or a stepping like we
> do in mountvirtfs.

I like the mountvirtfs / modules scripts' approaches, with a config file
containing a list of things to process.  At that point it's only a
question of whether we want to use this method to decide what devices
are necessary for checkfs/mountfs, or we want to use this method to
decide what devices might need to be retried.

I think that finding the devices necessary for checkfs/mountfs might be
a bit more fragile, actually; we would have to ensure that none of the
udev rules for the storage devices require anything else.  (They should
not, since those might be the first events triggered, but who knows what
might happen in the future, if upstream lives in a world where every
system runs an initramfs that mounts these FSes for them.)

Although, hmm.  Either way here, there's a possible problem, with
symlinks for disk devices.  If the USB ID file isn't present, then it's
possible that the /etc/fstab entry for /usr refers to a symlink that
relies on this file.  Of course, in that case you're just as screwed
even if you have an initramfs that does this mounting (since the
initramfs doesn't have the file either), so it's probably not worth
defending against.





Re: Bootscripts (again)

2011-09-15 Thread Bruce Dubbs
xinglp wrote:
> On 16 September 2011 at 6:19 AM, Bruce Dubbs wrote:
>> I've been reworking the bootscripts again.  I hope to have something
>> available by the weekend.  Here is where I am right now.

>> /lib/lsb is a symlink to /etc/services with files:
>>   init-functions
>>   ipv4-static
>>   ipv4-static-route

> 6.27. Iana-Etc-2.30 created /etc/services
> http://www.linuxfromscratch.org/lfs/view/development/chapter06/iana-etc.html

The above is a typo.  The correct line is:

   /lib/lsb is a symlink to /lib/services with files:

Thanks for pointing that out.

   -- Bruce


Re: Bootscripts (again)

2011-09-15 Thread xinglp
On 16 September 2011 at 6:19 AM, Bruce Dubbs wrote:
> I've been reworking the bootscripts again.  I hope to have something
> available by the weekend.  Here is where I am right now.
>
> Bootscript changes
>
> Added interactive capability
> Append /run/var/bootlog to /var/log/boot.log at the end of boot sequence
> /etc/init.d is a symlink to /etc/rc.d/init.d
>
> rc file now runs `source /etc/sysconfig/rc.site` if present
>   rc uses /bin/bash
>   rc.site may override variables set by /etc/init.d/rc
>   Allows for interactive boot (if IPROMPT="yes")
>   Allows for setting /fastboot when doing reboot (init 6) if FASTBOOT
>   is set (skips fsck)
>   Skips waiting for user responses for errors if HEADLESS is set
>   Skips cleaning /tmp if SKIPTMPCLEAN is set
>
> The settings in clock, console, and network can be optionally placed in
> rc.site and the separate files dropped.  If present, the separate files
> override rc.site.
>
> ifup/ifdown are now in /sbin
> ifup/ifdown will properly add/remove multiple IP addresses when requested.
>
> /lib/lsb is a symlink to /etc/services with files:
>   init-functions
>   ipv4-static
>   ipv4-static-route
6.27. Iana-Etc-2.30 created /etc/services
http://www.linuxfromscratch.org/lfs/view/development/chapter06/iana-etc.html
>
> All scripts have been rewritten to use lsb library functions log_*_msg,
> start_daemon, killproc, pidofproc.
>
>   log* functions write to /run/var/bootlog with timestamp
>
>   There are a few supplementary functions in init-functions:
>    log_success_msg2 (no timestamp)
>    log_failure_msg2 (no timestamp)
>    log_info_msg2    (no timestamp)
>    evaluate_retval  (finish user/log message)
>    wait_for_user    ( if ! HEADLESS, read)
>
> TODO
>   Write help into ifup/ifdown
>   Write man pages for ifup/ifdown
>   Update the LFS Book to reflect changes
>   Testing
>   Testing
>   Testing
>
> A lot of this has been taken from DJ's work, so many thanks to him.
>
>   -- Bruce
>


Re: udev_retry

2011-09-15 Thread DJ Lucas
On 09/15/2011 02:38 PM, Bruce Dubbs wrote:
> References are threads starting at:
> http://www.linuxfromscratch.org/pipermail/lfs-dev/2011-August/064960.html
> http://linuxfromscratch.org/pipermail/lfs-dev/2011-September/065130.html
>
> We are in a Catch-22 situation with udev_retry.  Here's a rundown:
>
> We need to start udev (S10udev) before mounting filesystems (S40mountfs)
> so that the device entries are available in order to mount partitions.
>
> udev will create some devices and may run some programs before all file
> systems are mounted (setclock) that need directories that are
> potentially not mounted (/usr, /var).
>
> The same issues come up for BLFS in alsa.
>
> Currently we are addressing these types of problems with the command:
>
> /sbin/udevadm trigger --type=failed --action=add
>
> in udev_retry.  The problem is that '--type=failed' has been deprecated
> upstream and we need to plan for that.  We also get a nasty warning
> message on every boot about the deprecation.
>
> In the infrequent case of a changed network card, we also need to be
> able to copy udev generated files from the tmpfs /run directory to /etc
> after / is remounted r/w, but that can be moved to the mountfs script
> from the udev_retry script.
>
> There are options about what to do right now:
>
> 1.  Leave in the warning message and optionally write something about it
> in the book.
>
> 2.  Add 2>/dev/null to the udevadm command above.
>
> 3.  Modify the source to remove the warning (delete 1 line).
Go with 3 for the time being, or see below

> 
>
> 4.  Reinsert the deleted retry code into udev with a patch.
>
>

Or recreate the functionality outside of udev. What is the consequence 
of manually triggering an add event that has already happened? From 
here, it looks like the device nodes are simply recreated (and rules 
processed appropriately). Look:

root@name64 [ /etc/udev/rules.d ]# echo "echo 'YES'> /etc/test.txt" >> /etc/init.d/setclock
root@name64 [ /etc/udev/rules.d ]# rm /etc/test.txt
rm: cannot remove `/etc/test.txt': No such file or directory
root@name64 [ /etc/udev/rules.d ]# ls -l /dev/rtc*
lrwxrwxrwx 1 root root  4 Sep 15 19:22 /dev/rtc -> rtc0
crw-r--r-- 1 root root 254, 0 Sep 15 19:22 /dev/rtc0
root@name64 [ /etc/udev/rules.d ]# udevadm trigger --subsystem-match=rtc --action=add
root@name64 [ /etc/udev/rules.d ]# cat /etc/test.txt
YES
root@name64 [ /etc/udev/rules.d ]# ls -l /dev/rtc*
lrwxrwxrwx 1 root root  4 Sep 15 19:24 /dev/rtc -> rtc0
crw-r--r-- 1 root root 254, 0 Sep 15 19:24 /dev/rtc0
root@name64 [ /etc/udev/rules.d ]#

As you might have guessed, I did that a few times before...hence only a
2-minute difference. Can we not simply re-trigger all known affected 
subsystems with a subsystem match? I don't really see the possibility of 
failure here, but I certainly am not the udev aficionado, so I could 
easily be missing something. Now, if that would work well enough, then 
just add a configuration file for the udev_retry bootscript so that it 
can be extended in BLFS for say ALSA, and then parse the list. In the 
udev_retry script, a for loop like so:

(
  failed=0
  for SUBSYSTEM in `grep -v '^#' /etc/udev_retry.conf`
  do
    udevadm trigger --subsystem-match=$SUBSYSTEM --action=add || failed=1
  done
  exit $failed
)

Or function and return, or test on the variable or whatever works well 
in the context of that particular boot script...you could even write a 
message for each one if you wanted to have more verbose output in the 
event of a failure, or a stepping like we do in mountvirtfs.
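
For illustration, here is a slightly more defensive sketch of the same loop. It assumes a list file with one subsystem per line and # comments. The function name retry_subsystems and the UDEVADM override variable are hypothetical, added only so the loop can be dry-run with a stub instead of the real udevadm.

```sh
# Sketch only: re-trigger "add" events for each subsystem listed in a file.
# retry_subsystems and the UDEVADM variable are hypothetical names.
retry_subsystems() {
    conf=$1
    failed=0

    # A missing or unreadable list is treated as "nothing to retry".
    [ -r "$conf" ] || return 0

    # The "|| [ -n ... ]" part handles a last line without a newline.
    while read -r subsystem rest || [ -n "$subsystem" ]; do
        # Skip blank lines and comment lines.
        case $subsystem in
            ''|'#'*) continue ;;
        esac
        ${UDEVADM:-udevadm} trigger --subsystem-match="$subsystem" \
            --action=add || failed=1
    done < "$conf"

    # Non-zero if any single trigger call failed.
    return $failed
}
```

In the bootscript this could be called as `retry_subsystems /etc/udev_retry.conf`, or with whatever path the list ends up at.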

> I'd like to see some more discussion about this.
>
> -- Bruce
What do y'all think?

-- DJ Lucas




Re: udev_retry

2011-09-15 Thread Nathan Coulson
On Thu, Sep 15, 2011 at 2:49 PM, Bruce Dubbs  wrote:
> Matthew Burgess wrote:
>> On 15/09/2011 20:38, Bruce Dubbs wrote:
>>
>> 
>>
>>> There are options about what to do right now:
>>>
>>> 1.  Leave in the warning message and optionally write something about it
>>> in the book.
>>
>> We try, generally, to accommodate changes in upstream programs.  I'll
>> defer to upstream's views on the fact that blindly retrying failed
>> actions in the hope that the system is now somehow in a state that will
>> enable them to run is not a particularly 'clever' option.  What's the
>> point in retrying them if we can't guarantee that they'll work this time
>> around?
>>
>> So, I think we should do something to remove the warning, and therefore
>> not have to explain it in the book...
>>
>>> 2.  Add 2>/dev/null to the udevadm command above.
>>
>> ...well, I guess that counts as something :)  I'd rather not hide the
>> error.  As you say below, when the --type=failed option is removed, this
>> will just come back and bite us anyway.
>>
>>> 3.  Modify the source to remove the warning (delete 1 line).
>>>
>>> When the --type=failed option is removed, we need to consider some other
>>> options:
>>
>> Same as 2.  This is just putting our hands over our eyes and pretending
>> the problem's gone away :)
>
> The above options were only intended for udev-173.   Removing the
> warning would only be a solution while --type=failed still exists.  The
> options below are intended for the situation after the option is removed.
>
> Removing the error message in this case only removes an ugly warning
> that we already know about.
>
> These options are intended after --type=failed support has been removed.
>
>>> 2.  Declare that separate /usr and /var partitions are not supported.
>>> They could be supported with an initrd that mounts the partitions before
>>> the kernel starts, but that would need to be in a hint like the one
>>> Bryan wrote in 2005.
>>
>> I'm 50/50 on this one.  It does have the benefits of being a) simple and
>> b) toeing the upstream line.  It does though, obviously mean we drop
>> support for a configuration that we've long supported.
>>
>>> 3.  Declare that we do not support hwclock settings that are not GMT.
>>
>> Well, you'd be insane not to set it to GMT anyway, right? :)  Again,
>> whilst I think this simplifies things, it does mean we'd drop support
>> for a configuration we've supported in the past.
>
> Well, I certainly use GMT, but the issue is dropping support for some
> users and I don't like that.
>
>>> 4.  Reinsert the deleted retry code into udev with a patch.
>>
>> Where was the smiley on the end of that one, Bruce? :)  I really
>> wouldn't want us to do this.
>
> How do you do a half-smiley?  I think I could do this, but I don't
> really think it would be appropriate for LFS.
>
>>> 5.  Since (I think) udev commands are run asynchronously, change the
>>> affected scripts (e.g. setclock) to wait in a loop for the appropriate
>>> partitions to be mounted.  For example:
>>>
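>>>     for i in {1..10}; do
>>>       if [ -d /usr/share ] && [ -d /var/lib ]; then break; fi
>>>       sleep 1
>>>     done
>>>
>>>     # Re-test the paths rather than $i, which is also 10 when the
>>>     # last pass succeeds; "error" stands in for the failure handling.
>>>     if [ ! -d /usr/share ] || [ ! -d /var/lib ]; then error; fi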
>>>     ...
>>
>> Now, if it's true that udev actions are in fact asynchronous, this is
>> probably the most pragmatic of solutions.  It's not clever, but then it
>> doesn't really have to be.  It could even be factored out into a
>> function, such that bootscripts could just call something like
>> "wait_for_path('/usr/share')"
>
> I'll note that on my system, setclock runs almost immediately after udev
> starts, but udev takes a while to finish:
>
> Sep 15 15:29:17 -05:00 (none)  Populating /dev with device nodes...
> Sep 15 15:29:17 -05:00 (none)  Setting system clock... OK
>  OK
> Sep 15 15:29:23 -05:00 (none)  Mounting root file system in read-only mode...  OK
> Sep 15 15:29:24 -05:00 (none)  Remounting root file system in read-write mode... OK
> Sep 15 15:29:24 -05:00 (none)  Recording existing mounts in /etc/mtab... OK
> Sep 15 15:29:24 -05:00 (none)  Mounting remaining file systems... OK
>
> So that's 7 seconds for udev.  Other systems may take longer.
>
> Also note that the two OK's in a row are for udev and setclock.  I don't
> know for sure which OK comes first, but I suspect setclock because I
> don't have a separate /usr or /var partition.
>
>>> 6.  ???
>>
>> Use a combination of systemd + initrd :-)
>
> I'm glad you used a smiley there.
>
>> So, based on the above, 5 is definitely something to look into I think.
>>   If that doesn't pan out, then I think option 2 is the next 'least worst'.
>
> I agree.
>
>   -- Bruce


Another thought (one I have not actually tested, forgive me if it's
not possible) is to trigger only block devices in the first pass, then
try devices/subsystems on the 2nd pass?



It looks like udevadm trigger could match "specific" devices,  what
about something like

udevadm trigger --type=devices --subsystem-match block
udevadm trigger
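
Spelled out as a sketch (equally untested against a real boot): the first function would run before checkfs/mountfs, and the second after the remaining file systems are mounted. The function names and the UDEVADM override are hypothetical, and the `udevadm settle` calls are added on the assumption that the script should wait for the queued events before moving on.

```sh
# Sketch only: two trigger passes. Function names and UDEVADM are
# hypothetical; UDEVADM lets the sequence be dry-run with a stub.

# Pass 1: only block devices, so checkfs/mountfs can find their nodes.
trigger_block_devices() {
    udevadm_cmd=${UDEVADM:-udevadm}
    $udevadm_cmd trigger --type=devices --subsystem-match=block \
        --action=add &&
    $udevadm_cmd settle
}

# Pass 2: everything, once the remaining file systems are mounted.
trigger_all_devices() {
    udevadm_cmd=${UDEVADM:-udevadm}
    $udevadm_cmd trigger --action=add &&
    $udevadm_cmd settle
}
```

The second pass replays events for the block devices too, which should be harmless if re-adding an existing device only recreates its node.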

Bootscripts (again)

2011-09-15 Thread Bruce Dubbs
I've been reworking the bootscripts again.  I hope to have something 
available by the weekend.  Here is where I am right now.

Bootscript changes

Added interactive capability
Append /run/var/bootlog to /var/log/boot.log at the end of boot sequence
/etc/init.d is a symlink to /etc/rc.d/init.d

rc file now runs `source /etc/sysconfig/rc.site` if present
   rc uses /bin/bash
   rc.site may override variables set by /etc/init.d/rc
   Allows for interactive boot (if IPROMPT="yes")
   Allows for setting /fastboot when doing reboot (init 6) if FASTBOOT
   is set (skips fsck)
   Skips waiting for user responses for errors if HEADLESS is set
   Skips cleaning /tmp if SKIPTMPCLEAN is set
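
As a rough sketch of the override order described above: defaults are set first, then the site file is sourced if it exists, so anything it assigns wins. The function name load_rc_site and the default values shown are assumptions for illustration, not the contents of the real rc script.

```sh
# Sketch only: defaults first, then an optional site file overrides them.
# load_rc_site and the default values are illustrative assumptions.
load_rc_site() {
    site_file=${1:-/etc/sysconfig/rc.site}

    # Defaults that rc would set itself.
    IPROMPT=no
    HEADLESS=
    FASTBOOT=
    SKIPTMPCLEAN=

    # "." is the portable spelling of bash's "source".
    if [ -r "$site_file" ]; then
        . "$site_file"
    fi
}
```

A site file containing only `IPROMPT="yes"` would then enable the interactive boot and leave the other settings at their defaults.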

The settings in clock, console, and network can be optionally placed in
rc.site and the separate files dropped.  If present, the separate files
override rc.site.

ifup/ifdown are now in /sbin
ifup/ifdown will properly add/remove multiple IP addresses when requested.

/lib/lsb is a symlink to /etc/services with files:
   init-functions
   ipv4-static
   ipv4-static-route

All scripts have been rewritten to use lsb library functions log_*_msg,
start_daemon, killproc, pidofproc.

   log* functions write to /run/var/bootlog with timestamp

   There are a few supplementary functions in init-functions:
      log_success_msg2 (no timestamp)
      log_failure_msg2 (no timestamp)
      log_info_msg2    (no timestamp)
      evaluate_retval  (finish user/log message)
      wait_for_user    (if ! HEADLESS, read)
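
To illustrate the "if ! HEADLESS, read" behaviour, a minimal sketch of wait_for_user might look like the following. The prompt text and the exact test on HEADLESS are assumptions; the version in init-functions may differ.

```sh
# Sketch only: pause for Enter unless HEADLESS is set to anything.
# The prompt wording is an assumption.
wait_for_user() {
    if [ -n "${HEADLESS:-}" ]; then
        # Unattended boot: never block on input.
        return 0
    fi
    printf 'Press Enter to continue...'
    # The reply is discarded; EOF on stdin is not treated as an error.
    read -r reply
    return 0
}
```

A bootscript would call this after reporting an error, so that a headless machine keeps booting while an attended one gives the user time to read the message.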

TODO
   Write help into ifup/ifdown
   Write man pages for ifup/ifdown
   Update the LFS Book to reflect changes
   Testing
   Testing
   Testing

A lot of this has been taken from DJ's work, so many thanks to him.

   -- Bruce



Re: udev_retry

2011-09-15 Thread Bruce Dubbs
Matthew Burgess wrote:
> On 15/09/2011 20:38, Bruce Dubbs wrote:
> 
> 
> 
>> There are options about what to do right now:
>>
>> 1.  Leave in the warning message and optionally write something about it
>> in the book.
> 
> We try, generally, to accommodate changes in upstream programs.  I'll 
> defer to upstream's views on the fact that blindly retrying failed 
> actions in the hope that the system is now somehow in a state that will 
> enable them to run is not a particularly 'clever' option.  What's the 
> point in retrying them if we can't guarantee that they'll work this time 
> around?
> 
> So, I think we should do something to remove the warning, and therefore 
> not have to explain it in the book...
> 
>> 2.  Add 2>/dev/null to the udevadm command above.
> 
> ...well, I guess that counts as something :)  I'd rather not hide the 
> error.  As you say below, when the --type=failed option is removed, this 
> will just come back and bite us anyway.
> 
>> 3.  Modify the source to remove the warning (delete 1 line).
>>
>> When the --type=failed option is removed, we need to consider some other
>> options:
> 
> Same as 2.  This is just putting our hands over our eyes and pretending 
> the problem's gone away :)

The above options were only intended for udev-173.   Removing the 
warning would only be a solution while --type=failed still exists.  The 
options below are intended for the situation after the option is removed.

Removing the error message in this case only removes an ugly warning 
that we already know about.

These options are intended after --type=failed support has been removed.

>> 2.  Declare that separate /usr and /var partitions are not supported.
>> They could be supported with an initrd that mounts the partitions before
>> the kernel starts, but that would need to be in a hint like the one
>> Bryan wrote in 2005.
> 
> I'm 50/50 on this one.  It does have the benefits of being a) simple and 
> b) toeing the upstream line.  It does though, obviously mean we drop 
> support for a configuration that we've long supported.
> 
>> 3.  Declare that we do not support hwclock settings that are not GMT.
> 
> Well, you'd be insane not to set it to GMT anyway, right? :)  Again, 
> whilst I think this simplifies things, it does mean we'd drop support 
> for a configuration we've supported in the past.

Well, I certainly use GMT, but the issue is dropping support for some 
users and I don't like that.

>> 4.  Reinsert the deleted retry code into udev with a patch.
> 
> Where was the smiley on the end of that one, Bruce? :)  I really 
> wouldn't want us to do this.

How do you do a half-smiley?  I think I could do this, but I don't 
really think it would be appropriate for LFS.

>> 5.  Since (I think) udev commands are run asynchronously, change the
>> affected scripts (e.g. setclock) to wait in a loop for the appropriate
>> partitions to be mounted.  For example:
>>
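>> for i in {1..10}; do
>>   if [ -d /usr/share ] && [ -d /var/lib ]; then break; fi
>>   sleep 1
>> done
>>
>> # Re-test the paths rather than $i, which is also 10 when the
>> # last pass succeeds; "error" stands in for the failure handling.
>> if [ ! -d /usr/share ] || [ ! -d /var/lib ]; then error; fi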
>> ...
> 
> Now, if it's true that udev actions are in fact asynchronous, this is 
> probably the most pragmatic of solutions.  It's not clever, but then it 
> doesn't really have to be.  It could even be factored out into a 
> function, such that bootscripts could just call something like 
> "wait_for_path('/usr/share')"

I'll note that on my system, setclock runs almost immediately after udev 
starts, but udev takes a while to finish:

Sep 15 15:29:17 -05:00 (none)  Populating /dev with device nodes...
Sep 15 15:29:17 -05:00 (none)  Setting system clock... OK
  OK
Sep 15 15:29:23 -05:00 (none)  Mounting root file system in read-only mode...  OK
Sep 15 15:29:24 -05:00 (none)  Remounting root file system in read-write mode... OK
Sep 15 15:29:24 -05:00 (none)  Recording existing mounts in /etc/mtab... OK
Sep 15 15:29:24 -05:00 (none)  Mounting remaining file systems... OK

So that's 7 seconds for udev.  Other systems may take longer.

Also note that the two OK's in a row are for udev and setclock.  I don't 
know for sure which OK comes first, but I suspect setclock because I 
don't have a separate /usr or /var partition.

>> 6.  ???
> 
> Use a combination of systemd + initrd :-)

I'm glad you used a smiley there.

> So, based on the above, 5 is definitely something to look into I think. 
>   If that doesn't pan out, then I think option 2 is the next 'least worst'.

I agree.

   -- Bruce



Re: udev_retry

2011-09-15 Thread Matthew Burgess
On 15/09/2011 20:38, Bruce Dubbs wrote:



> There are options about what to do right now:
>
> 1.  Leave in the warning message and optionally write something about it
> in the book.

We try, generally, to accommodate changes in upstream programs.  I'll 
defer to upstream's views on the fact that blindly retrying failed 
actions in the hope that the system is now somehow in a state that will 
enable them to run is not a particularly 'clever' option.  What's the 
point in retrying them if we can't guarantee that they'll work this time 
around?

So, I think we should do something to remove the warning, and therefore 
not have to explain it in the book...

> 2.  Add 2>/dev/null to the udevadm command above.

...well, I guess that counts as something :)  I'd rather not hide the 
error.  As you say below, when the --type=failed option is removed, this 
will just come back and bite us anyway.

> 3.  Modify the source to remove the warning (delete 1 line).
>
> When the --type=failed option is removed, we need to consider some other
> options:

Same as 2.  This is just putting our hands over our eyes and pretending 
the problem's gone away :)

> 1.  Delete the affected udev rules that run the problem commands (in
> udev/rules.d/55-lfs.rules) and run them explicitly when we want them to
> run.  This could create a potential, but unlikely, race condition.

I'm not particularly enamored by this one.  The rules are in there for a 
reason, and in particular, if we get rid of the snd* devices action, 
that will stop folks from being able to plug and play sound devices in 
particular.

> 2.  Declare that separate /usr and /var partitions are not supported.
> They could be supported with an initrd that mounts the partitions before
> the kernel starts, but that would need to be in a hint like the one
> Bryan wrote in 2005.

I'm 50/50 on this one.  It does have the benefits of being a) simple and 
b) toeing the upstream line.  It does though, obviously mean we drop 
support for a configuration that we've long supported.

> 3.  Declare that we do not support hwclock settings that are not GMT.

Well, you'd be insane not to set it to GMT anyway, right? :)  Again, 
whilst I think this simplifies things, it does mean we'd drop support 
for a configuration we've supported in the past.

> 4.  Reinsert the deleted retry code into udev with a patch.

Where was the smiley on the end of that one, Bruce? :)  I really 
wouldn't want us to do this.

> 5.  Since (I think) udev commands are run asynchronously, change the
> affected scripts (e.g. setclock) to wait in a loop for the appropriate
> partitions to be mounted.  For example:
>
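> for i in {1..10}; do
>   if [ -d /usr/share ] && [ -d /var/lib ]; then break; fi
>   sleep 1
> done
>
> # Re-test the paths rather than $i, which is also 10 when the
> # last pass succeeds; "error" stands in for the failure handling.
> if [ ! -d /usr/share ] || [ ! -d /var/lib ]; then error; fi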
> ...

Now, if it's true that udev actions are in fact asynchronous, this is 
probably the most pragmatic of solutions.  It's not clever, but then it 
doesn't really have to be.  It could even be factored out into a 
function, such that bootscripts could just call something like 
"wait_for_path('/usr/share')"
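
A minimal sketch of such a helper, assuming a calling convention of a timeout in seconds followed by one or more paths (shell functions take plain arguments rather than parentheses). The name comes from the suggestion above; the argument order and return values are assumptions.

```sh
# Sketch only: wait until every given path exists or the timeout passes.
# Usage: wait_for_path TIMEOUT_SECONDS PATH...
# Returns 0 when all paths exist, 1 on timeout.
wait_for_path() {
    timeout=$1
    shift
    elapsed=0
    while :; do
        missing=0
        for path in "$@"; do
            [ -e "$path" ] || missing=1
        done
        # Succeed as soon as nothing is missing.
        [ "$missing" -eq 0 ] && return 0
        # Give up once the timeout has been used up.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

A script such as setclock could then start with `wait_for_path 10 /usr/share /var/lib || exit 1`. Because the paths are checked before the timeout, a path that appears on the very last pass still counts as success.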

> 6.  ???

Use a combination of systemd + initrd :-)

So, based on the above, 5 is definitely something to look into I think. 
  If that doesn't pan out, then I think option 2 is the next 'least worst'.

Thanks,

Matt.


udev_retry

2011-09-15 Thread Bruce Dubbs
References are threads starting at:
http://www.linuxfromscratch.org/pipermail/lfs-dev/2011-August/064960.html
http://linuxfromscratch.org/pipermail/lfs-dev/2011-September/065130.html

We are in a Catch-22 situation with udev_retry.  Here's a rundown:

We need to start udev (S10udev) before mounting filesystems (S40mountfs) 
so that the device entries are available in order to mount partitions.

udev will create some devices and may run some programs before all file 
systems are mounted (setclock) that need directories that are 
potentially not mounted (/usr, /var).

The same issues come up for BLFS in alsa.

Currently we are addressing these types of problems with the command:

   /sbin/udevadm trigger --type=failed --action=add

in udev_retry.  The problem is that '--type=failed' has been deprecated 
upstream and we need to plan for that.  We also get a nasty warning 
message on every boot about the deprecation.

In the infrequent case of a changed network card, we also need to be 
able to copy udev generated files from the tmpfs /run directory to /etc 
after / is remounted r/w, but that can be moved to the mountfs script 
from the udev_retry script.
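
As a sketch of what that copy step could look like once / is writable: the function below appends each temporary rules file to a permanent file of the same name and then removes the temporary copy. The tmp-rules-- file name prefix, the function name, and both directory arguments are assumptions for illustration only.

```sh
# Sketch only: append rules generated while / was read-only to their
# permanent files. The tmp-rules-- prefix, the function name, and both
# directory arguments are assumptions.
save_generated_rules() {
    src_dir=$1
    dest_dir=$2
    for file in "$src_dir"/tmp-rules--*; do
        # An unmatched glob stays literal; skip it.
        [ -f "$file" ] || continue
        # Strip everything up to and including the prefix.
        dest=${file##*tmp-rules--}
        # Only delete the temporary file if the append worked.
        cat "$file" >> "$dest_dir/$dest" && rm -f "$file"
    done
}
```

In mountfs this might be called with the tmpfs directory under /run and the rules directory under /etc as its two arguments.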

There are options about what to do right now:

1.  Leave in the warning message and optionally write something about it 
in the book.

2.  Add 2>/dev/null to the udevadm command above.

3.  Modify the source to remove the warning (delete 1 line).

When the --type=failed option is removed, we need to consider some other 
options:

1.  Delete the affected udev rules that run the problem commands (in 
udev/rules.d/55-lfs.rules) and run them explicitly when we want them to 
run.  This could create a potential, but unlikely, race condition.

2.  Declare that separate /usr and /var partitions are not supported. 
They could be supported with an initrd that mounts the partitions before 
the kernel starts, but that would need to be in a hint like the one 
Bryan wrote in 2005.

3.  Declare that we do not support hwclock settings that are not GMT.

4.  Reinsert the deleted retry code into udev with a patch.

5.  Since (I think) udev commands are run asynchronously, change the 
affected scripts (e.g. setclock) to wait in a loop for the appropriate 
partitions to be mounted.  For example:

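   for i in {1..10}; do
     if [ -d /usr/share ] && [ -d /var/lib ]; then break; fi
     sleep 1
   done

   # Re-test the paths rather than $i, which is also 10 when the
   # last pass succeeds; "error" stands in for the failure handling.
   if [ ! -d /usr/share ] || [ ! -d /var/lib ]; then error; fi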
   ...

6.  ???

I'd like to see some more discussion about this.

   -- Bruce