DJ Lucas wrote:

> Yes at least the echo is required. Get to that in a moment. The loop is as well, but it's still undecided according to the comments in the ticket. I'll be able to test at least the loop doesn't do anything silly tonight, but I don't believe I have a reliable way to reproduce the bug as my system is now.

I consider ticket 1720 presumably closed once the bootscript changes below are implemented; the udev_update branch can then be merged. This mail to lfs-dev only states that and invites discussion if any is needed (IMHO, it is not).

The status quo is:

1) Uevents that are triggered by echoing to sysfs "uevent" files are processed by udev asynchronously. The end of the loop that does the "echo" only means that the kernel has put all uevents into the netlink socket buffer. There is absolutely no guarantee that udevd received them all at that moment.

2) Therefore, upstream has the recommendation to wait for udevd to process all these events, and all uevents raised as a consequence of "modprobe" called from udev rules in response to earlier uevents. Upstream provides a way to know if the udev queue (that is not the same as in-kernel netlink socket buffer!) is empty, and proposes the following shell script snippet:

        udevd --daemon
        mkdir -p /dev/.udev/queue
        walk_sysfs
# until we know how to do better, just wait for _all_ events to finish
        loop=300
        while test -d /dev/.udev/queue; do
            sleep 0.1
            test "$loop" -gt 0 || break
            loop=$(($loop - 1))
        done
        test "$loop" -gt 0
        evaluate_retval
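
For readers without the bootscript at hand, here is a minimal sketch of what a walk_sysfs-style function does. It is not the actual LFS implementation: the name trigger_uevents and its root-directory argument are made up for illustration (the argument lets the loop be tried on a scratch directory), and a real implementation may restrict itself to particular sysfs subtrees. It assumes that writing "add" to a device's "uevent" file makes the kernel re-send that event, as described in (1).

```shell
# trigger_uevents ROOT: hypothetical stand-in for walk_sysfs.
# Writes "add" to every regular file named "uevent" below ROOT
# (ROOT defaults to /sys).
trigger_uevents() {
    root=${1:-/sys}
    find "$root" -type f -name uevent 2>/dev/null |
    while IFS= read -r f; do
        # Entries that cannot be written are silently skipped.
        { echo add > "$f"; } 2>/dev/null || :
    done
}
```

When this function returns, the events have only been queued by the kernel; nothing here waits for udevd to handle them.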

3) The shell script above does not work. The udev queue directory disappears for a moment (because udevd has no way to know whether any further uevents will arrive) and then reappears, but by then the loop has already concluded that it is done! So I proposed an alternative loop that not only waits for the queue to disappear, but also keeps polling for a while to make sure that it doesn't reappear:

        udevd --daemon
        mkdir -p /dev/.udev/queue
        walk_sysfs
# until we know how to do better, just wait for _all_ events to finish
        loop=300
        confirm=0
        while true ; do
            sleep 0.1
            test -d /dev/.udev/queue && confirm=0 || \
                confirm=$(( $confirm + 1 ))
            loop=$(( $loop - 1 ))
            test $loop -gt 0 || break
            test $confirm -lt 60 || break
        done
        test "$loop" -gt 0
        evaluate_retval

Archaic says this produces much better results on his system.
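
To make the two limits in the loop above explicit: with "sleep 0.1", loop=300 is an overall timeout of roughly 30 seconds, and confirm=60 demands roughly 6 seconds without a queue directory. The same idea can be sketched as a function with those numbers as parameters. The name wait_for_queue and its argument order are hypothetical, the boundary behaviour differs slightly from the inline loop, and a fractional "sleep" argument assumes a sleep that accepts one (GNU coreutils does).

```shell
# wait_for_queue QUEUE_DIR MAX_POLLS CONFIRM_POLLS [INTERVAL]
# Returns 0 once QUEUE_DIR has been absent for CONFIRM_POLLS polls in a row,
# or 1 if MAX_POLLS polls pass before that happens.
wait_for_queue() {
    queue=$1 loop=$2 need=$3 interval=${4:-0.1}
    confirm=0
    while [ "$loop" -gt 0 ]; do
        sleep "$interval"
        if [ -d "$queue" ]; then
            confirm=0                    # queue is back: start counting again
        else
            confirm=$((confirm + 1))
            [ "$confirm" -lt "$need" ] || return 0
        fi
        loop=$((loop - 1))
    done
    return 1
}
```

In the bootscript this would be called as "wait_for_queue /dev/.udev/queue 300 60" followed by evaluate_retval.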

4) It is imperative that we don't trust upstream (because they ignore problem (3)), and do our own checking. In order to make sure that no initial uevents escape the loop, compile the following simple event recorder that adds the uevent contents to the /dev/bug file if the file exists:

/* bug.c: Simple event recorder, gcc -o /lib/udev/bug bug.c */
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <argz.h>

int main(int argc, char * argv[])
{
        char * envz;
        size_t len;
        int bug;
        /* No O_CREAT: record only if /dev/bug already exists */
        bug = open("/dev/bug", O_WRONLY | O_APPEND);
        if (bug == -1)
                return 0;
        setenv("_SEPARATOR", "------------------------------------", 1);
        /* Copy the environment into one NUL-separated buffer */
        if (argz_create(environ, &envz, &len) != 0) {
                close(bug);
                return 0;
        }
        /* Turn the separating NULs into newlines, then the final one too */
        argz_stringify(envz, len, '\n');
        envz[len-1]='\n';
        write(bug, envz, len);
        close(bug);
        free(envz);
        return 0;
}

Add this rule to a separate file, 90-debug.rules:

ACTION=="add", RUN+="bug"

Add this shell script snippet to the udev bootscripts after the waiting loop. This snippet checks whether any uevents arrive within 5 seconds of the end of the loop. Assuming that the user doesn't plug devices in or out during the boot process, such uevents indicate a bug in the waiting loop.

.....
            test $confirm -lt 60 || break
        done
        >/dev/bug
        test "$loop" -gt 0
        evaluate_retval
        sleep 5
        if test -s /dev/bug; then
            mv /dev/bug /dev/bugreport
            boot_mesg "Please paste the /dev/bugreport file to" ${WARNING}
            boot_mesg "http://wiki.linuxfromscratch.org/lfs/ticket/1720"
            boot_mesg "Otherwise, the next version of LFS may be unbootable on your system!"
            echo_failure
            sleep 10
        else
            rm -f /dev/bug
        fi
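
Before pasting a report it may help to see what was caught. Since bug.c adds a _SEPARATOR variable to the environment of every event it records, counting those lines gives the number of late events, and the DEVPATH lines that udev puts into the environment name the devices. A small sketch, assuming the file has exactly the format bug.c writes; count_events and list_devpaths are hypothetical helper names:

```shell
# count_events FILE: print the number of uevents recorded in FILE.
count_events() {
    grep -c '^_SEPARATOR=' "$1"
}

# list_devpaths FILE: print the DEVPATH of each recorded uevent.
list_devpaths() {
    sed -n 's/^DEVPATH=//p' "$1"
}
```

For example, "count_events /dev/bugreport" followed by "list_devpaths /dev/bugreport".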

--
Alexander E. Patrakov