Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On 8/28/19 5:41 PM, Richard Purdie wrote:
> On Wed, 2019-08-28 at 15:24 -0500, Jason Wessel wrote:
> > On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:
> > > On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> > > > On 8/27/19 5:58 PM, Richard Purdie wrote:
> > > > > Hi Jason,
> > > > >
> > > > > Somehow this change is responsible for this build failure:
> > > > >
> > > > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > > > >
> > > > > (steps 5c and 7c so failure during testimage).
> > > > >
> > > > > I have bisected it to this change, I haven't looked into why.
> > > >
> > > > Thanks for tracking it down. I am not sure how to try and
> > > > duplicate this. I clicked around to try and find out a bit about
> > > > what it is running for these phases of the build but it is not
> > > > very obvious.
> > > >
> > > > Is there a local.conf I can try along with whatever commands it
> > > > ran?
> > >
> > > The configuration it's using is shown in
> > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
> > > for each step.
> >
> > I tried the configuration on my system but it seems to work fine, see
> > below. Is there a way to get a complete log file of the boot on the
> > broken host? I am also curious if it fails every time or not.
> >
> > /home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722
> >
> > I am not sure it will tell us anything further or not, but it
> > certainly looks like the system didn't boot correctly in the first
> > place.
>
> I've rerun it as
> https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/983

I'll take a further look at it tomorrow. I have no idea why systemd
emitted this message:

[ 4.750194] systemd[1]: Unnecessary job for /dev/ttyS1 was removed.
[ 4.751666] systemd[1]: Unnecessary job for /dev/ttyS0 was removed.
[ OK ] Started Serial Getty on ttyS0.
Stopping Serial Getty on ttyS0...

I have an image now with the problem, so I'll make a copy of it in case
the problem "goes away" again. It will be interesting to see what the
deal is with this corner case.

Cheers,
Jason.
-- ___ Openembedded-core mailing list Openembedded-core@lists.openembedded.org http://lists.openembedded.org/mailman/listinfo/openembedded-core
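When comparing a good and a bad boot log, it may help to pull out just the devices whose jobs systemd dropped. The following is only a sketch: the function name `removed_device_jobs` is made up for illustration, and it assumes the log contains lines in the same form as the two "Unnecessary job for ... was removed." messages quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): print the device path from every
# "Unnecessary job for <device> was removed." line found in the log file $1.
removed_device_jobs() {
    sed -n 's/.*systemd\[1\]: Unnecessary job for \(.*\) was removed\..*/\1/p' "$1"
}

# Example use against a saved copy of the boot log:
#   removed_device_jobs qemu_boot_log.2019082722
```

Running it on both a working and a failing log shows whether the ttyS0 and ttyS1 jobs are dropped only in the failing case.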
Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On Wed, 2019-08-28 at 15:24 -0500, Jason Wessel wrote:
> On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:
> > On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> > > On 8/27/19 5:58 PM, Richard Purdie wrote:
> > > > Hi Jason,
> > > >
> > > > Somehow this change is responsible for this build failure:
> > > >
> > > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > > >
> > > > (steps 5c and 7c so failure during testimage).
> > > >
> > > > I have bisected it to this change, I haven't looked into why.
> > >
> > > Thanks for tracking it down. I am not sure how to try and duplicate
> > > this. I clicked around to try and find out a bit about what it is
> > > running for these phases of the build but it is not very obvious.
> > >
> > > Is there a local.conf I can try along with whatever commands it
> > > ran?
> >
> > The configuration it's using is shown in
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
> > for each step.
>
> I tried the configuration on my system but it seems to work fine, see
> below. Is there a way to get a complete log file of the boot on the
> broken host? I am also curious if it fails every time or not.
>
> /home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722
>
> I am not sure it will tell us anything further or not, but it
> certainly looks like the system didn't boot correctly in the first
> place.

I've rerun it as
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/983

Cheers,

Richard
Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On Wed, 2019-08-28 at 15:24 -0500, Jason Wessel wrote:
> On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:
> > On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> > > On 8/27/19 5:58 PM, Richard Purdie wrote:
> > > > Hi Jason,
> > > >
> > > > Somehow this change is responsible for this build failure:
> > > >
> > > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > > >
> > > > (steps 5c and 7c so failure during testimage).
> > > >
> > > > I have bisected it to this change, I haven't looked into why.
> > >
> > > Thanks for tracking it down. I am not sure how to try and duplicate
> > > this. I clicked around to try and find out a bit about what it is
> > > running for these phases of the build but it is not very obvious.
> > >
> > > Is there a local.conf I can try along with whatever commands it
> > > ran?
> >
> > The configuration it's using is shown in
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
> > for each step.
>
> I tried the configuration on my system but it seems to work fine, see
> below. Is there a way to get a complete log file of the boot on the
> broken host? I am also curious if it fails every time or not.
>
> /home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722

If we'd got this straight after the build then yes, but it's been
recycled now. It did seem to fail repeatedly when I tested it. I'd
probably have to add the patch back and retest again to get the log.

> I am not sure it will tell us anything further or not, but it
> certainly looks like the system didn't boot correctly in the first
> place.

It does seem like a boot issue somehow... The test case on the
autobuilder seems simple enough: systemd boot alone with no sysvinit.

Cheers,

Richard
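For anyone trying to reproduce the "systemd boot alone with no sysvinit" case locally, the following is a rough sketch and not the autobuilder's actual setup (that is in the step log linked above). The helper name `add_systemd_only_config` is made up for illustration, and the variable settings are the commonly used systemd-only ones for releases of this era; they are assumed here, not taken from this thread.

```shell
#!/bin/sh
# Hypothetical helper: append a systemd-only, testimage-enabled configuration
# to the local.conf given as $1. These settings are an assumption and are not
# copied from the autobuilder configuration.
add_systemd_only_config() {
    cat >> "$1" <<'EOF'
DISTRO_FEATURES_append = " systemd"
DISTRO_FEATURES_BACKFILL_CONSIDERED = "sysvinit"
VIRTUAL-RUNTIME_init_manager = "systemd"
VIRTUAL-RUNTIME_initscripts = ""
INHERIT += "testimage"
EOF
}

# Typical use from an initialised build directory (image name taken from the
# log path quoted in this thread):
#   add_systemd_only_config conf/local.conf
#   bitbake core-image-sato
#   bitbake core-image-sato -c testimage
```

With the patch applied or reverted in the layer, the two testimage runs can then be compared on the same host.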
Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:
> On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> > On 8/27/19 5:58 PM, Richard Purdie wrote:
> > > Hi Jason,
> > >
> > > Somehow this change is responsible for this build failure:
> > >
> > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > >
> > > (steps 5c and 7c so failure during testimage).
> > >
> > > I have bisected it to this change, I haven't looked into why.
> >
> > Thanks for tracking it down. I am not sure how to try and duplicate
> > this. I clicked around to try and find out a bit about what it is
> > running for these phases of the build but it is not very obvious.
> >
> > Is there a local.conf I can try along with whatever commands it ran?
>
> The configuration it's using is shown in
> https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
> for each step.

I tried the configuration on my system but it seems to work fine, see
below. Is there a way to get a complete log file of the boot on the
broken host? I am also curious if it fails every time or not.

/home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722

I am not sure it will tell us anything further or not, but it certainly
looks like the system didn't boot correctly in the first place.

Jason.
Initialising tasks: 100% |###
Sstate summary: Wanted 0 Found 0 Missed 0 Current 90 (0% match, 100% complete)
NOTE: Executing Tasks
NOTE: Setscene tasks completed
Started HTTPService on 128.224.149.46:38317
Test requires apt to be installed
Stopped HTTPService on 128.224.149.46:38317
Test requires autoconf to be installed
Test requires gtk+3 to be installed
Test requires autoconf to be installed
Started HTTPService on 128.224.149.46:32961
Stopped HTTPService on 128.224.149.46:32961
Test requires gcc to be installed
Test requires g++ to be installed
Test requires g++ to be installed
Test requires make to be installed
Test requires python3-pygobject to be installed
Test requires gcc to be installed
Test requires ldd to be installed
Test requires logrotate to be installed
Test case logrotate.LogrotateTest.test_2_logrotate depends on logrotate.LogrotateTest.test_1_logrotate_setup but it didn'
Not appropiate for systemd image
Test requires opkg to be installed
Test requires pam to be in DISTRO_FEATURES
Test requires ptest-runner to be installed
Test requires systemtap to be installed
Startup finished in 3.840s (kernel) + 2.157s (userspace) = 5.998s.

RESULTS:
RESULTS - connman.ConnmanTest.test_connmand_help: PASSED (0.11s)
RESULTS - connman.ConnmanTest.test_connmand_running: PASSED (0.11s)
RESULTS - date.DateTest.test_date: PASSED (0.63s)
RESULTS - df.DfTest.test_df: PASSED (0.11s)
RESULTS - dnf.DnfBasicTest.test_dnf_help: PASSED (0.78s)
RESULTS - dnf.DnfBasicTest.test_dnf_history: PASSED (0.44s)
RESULTS - dnf.DnfBasicTest.test_dnf_info: PASSED (0.37s)
RESULTS - dnf.DnfBasicTest.test_dnf_search: PASSED (0.35s)
RESULTS - dnf.DnfBasicTest.test_dnf_version: PASSED (0.32s)
RESULTS - dnf.DnfRepoTest.test_dnf_exclude: PASSED (5.03s)
RESULTS - dnf.DnfRepoTest.test_dnf_install: PASSED (0.78s)
RESULTS - dnf.DnfRepoTest.test_dnf_install_dependency: PASSED (1.77s)
RESULTS - dnf.DnfRepoTest.test_dnf_install_from_disk: PASSED (1.97s)
RESULTS - dnf.DnfRepoTest.test_dnf_install_from_http: PASSED (1.52s)
RESULTS - dnf.DnfRepoTest.test_dnf_installroot: PASSED (6.91s)
RESULTS - dnf.DnfRepoTest.test_dnf_makecache: PASSED (0.42s)
RESULTS - dnf.DnfRepoTest.test_dnf_reinstall: PASSED (0.61s)
RESULTS - dnf.DnfRepoTest.test_dnf_repoinfo: PASSED (0.35s)
RESULTS - oe_syslog.SyslogTest.test_syslog_running: PASSED (0.12s)
RESULTS - oe_syslog.SyslogTestConfig.test_syslog_logger: PASSED (1.21s)
RESULTS - oe_syslog.SyslogTestConfig.test_syslog_restart: PASSED (0.21s)
RESULTS - parselogs.ParseLogsTest.test_parselogs: PASSED (1.91s)
RESULTS - perl.PerlTest.test_perl_works: PASSED (0.11s)
RESULTS - ping.PingTest.test_ping: PASSED (0.06s)
RESULTS - python.PythonTest.test_python3: PASSED (0.12s)
RESULTS - rpm.RpmBasicTest.test_rpm_help: PASSED (0.11s)
RESULTS - rpm.RpmBasicTest.test_rpm_query: PASSED (0.18s)
RESULTS - rpm.RpmBasicTest.test_rpm_query_nonroot: PASSED (0.89s)
RESULTS - rpm.RpmInstallRemoveTest.test_check_rpm_install_removal_log_file_size: PASSED (3.54s)
RESULTS - rpm.RpmInstallRemoveTest.test_rpm_install: PASSED (0.37s)
RESULTS - rpm.RpmInstallRemoveTest.test_rpm_remove: PASSED (0.15s)
RESULTS - scp.ScpTest.test_scp_file: PASSED (0.41s)
RESULTS - ssh.SSHTest.test_ssh: PASSED (0.51s)
RESULTS - systemd.SystemdBasicTests.test_systemd_basic: PASSED (0.12s)
RESULTS - systemd.SystemdBasicTests.test_systemd_failed: PASSED (0.21s)
RESULTS - systemd.SystemdBasicTests.test_systemd_list: PASSED (0.44s)
RESULTS - systemd.SystemdJournalTests.test_systemd_boot_time: PASSED (0.11s)
RESULTS -
Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> On 8/27/19 5:58 PM, Richard Purdie wrote:
> > Hi Jason,
> >
> > Somehow this change is responsible for this build failure:
> >
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> >
> > (steps 5c and 7c so failure during testimage).
> >
> > I have bisected it to this change, I haven't looked into why.
>
> Thanks for tracking it down. I am not sure how to try and duplicate
> this. I clicked around to try and find out a bit about what it is
> running for these phases of the build but it is not very obvious.
>
> Is there a local.conf I can try along with whatever commands it ran?

The configuration it's using is shown in
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
for each step.

Cheers,

Richard
Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On 8/27/19 5:58 PM, Richard Purdie wrote:
> Hi Jason,
>
> Somehow this change is responsible for this build failure:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
>
> (steps 5c and 7c so failure during testimage).
>
> I have bisected it to this change, I haven't looked into why.

Thanks for tracking it down. I am not sure how to try and duplicate
this. I clicked around to try and find out a bit about what it is
running for these phases of the build but it is not very obvious.

Is there a local.conf I can try along with whatever commands it ran?

Jason.
Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
On Tue, 2019-08-20 at 17:27 -0700, Jason Wessel wrote:
> Some BSPs use a USB serial port which may or may not actually be
> plugged in all the time. It is quite useful to have a USB serial port
> have a getty running but it does not make sense to wait for it for 90
> seconds before completing the system startup if it might never get
> plugged in. The typical example is that a USB serial device might
> only need to be plugged in when debugging, upgrading, or initially
> configuring a device.
>
> This change is somewhat subtle. Systemd uses the "BindsTo" directive
> to ensure existence of the device in order to start the service as
> well as to terminate the service if the device goes away. The "After"
> directive makes that same relationship stronger, and has the undesired
> side effect that systemd will wait until its internal timeout value
> for the device to come online before executing a fail operation or
> letting other tasks and groups continue. This is certainly the kind
> of behavior we want for a disk, but not for serial ports in general.
>
> The kernel module loader and device detection will have run a long
> time before the getty startup. By the time the getty startup occurs
> the system has all the serial devices it's going to get.
>
> If you want to observe the problem with qemu, it is easy to replicate.
> Simply add the following line to your local.conf for an x86-64 qemu
> build.
>
> SERIAL_CONSOLES="115200;ttyS0 115200;ttyUSB0"
>
> Log in right after the system boots and observe:
>
>    root@qemux86-64:~# systemctl list-jobs |cat
>    JOB UNIT                                 TYPE  STATE
>      1 multi-user.target                    start waiting
>     69 serial-getty@ttyUSB0.service         start waiting
>     64 getty.target                         start waiting
>     71 dev-ttyUSB0.device                   start running
>     62 systemd-update-utmp-runlevel.service start waiting
>
>    5 jobs listed.
>
> You can see above that the dev-ttyUSB0.device will block for 1min 30
> seconds. While that might not be a problem for this reference build,
> it is certainly a problem for images that have software watchdogs that
> verify the system booted up all the way to systemd completion in less
> than 90 seconds.
>
> The other nice effect of this change is that the fast fail extends to
> additional serial ports that may not exist on ARM BSPs or that might
> be configured in or out by the dtb files on different boards.
>
> Signed-off-by: Jason Wessel
> ---
>  .../systemd/systemd-serialgetty/serial-getty@.service | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Hi Jason,

Somehow this change is responsible for this build failure:

https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976

(steps 5c and 7c so failure during testimage).

I have bisected it to this change, I haven't looked into why.

Cheers,

Richard
[OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist
Some BSPs use a USB serial port which may or may not actually be
plugged in all the time. It is quite useful to have a USB serial port
have a getty running but it does not make sense to wait for it for 90
seconds before completing the system startup if it might never get
plugged in. The typical example is that a USB serial device might
only need to be plugged in when debugging, upgrading, or initially
configuring a device.

This change is somewhat subtle. Systemd uses the "BindsTo" directive
to ensure existence of the device in order to start the service as
well as to terminate the service if the device goes away. The "After"
directive makes that same relationship stronger, and has the undesired
side effect that systemd will wait until its internal timeout value
for the device to come online before executing a fail operation or
letting other tasks and groups continue. This is certainly the kind
of behavior we want for a disk, but not for serial ports in general.

The kernel module loader and device detection will have run a long
time before the getty startup. By the time the getty startup occurs
the system has all the serial devices it's going to get.

If you want to observe the problem with qemu, it is easy to replicate.
Simply add the following line to your local.conf for an x86-64 qemu
build.

SERIAL_CONSOLES="115200;ttyS0 115200;ttyUSB0"

Log in right after the system boots and observe:

   root@qemux86-64:~# systemctl list-jobs |cat
   JOB UNIT                                 TYPE  STATE
     1 multi-user.target                    start waiting
    69 serial-getty@ttyUSB0.service         start waiting
    64 getty.target                         start waiting
    71 dev-ttyUSB0.device                   start running
    62 systemd-update-utmp-runlevel.service start waiting

   5 jobs listed.

You can see above that the dev-ttyUSB0.device will block for 1min 30
seconds. While that might not be a problem for this reference build,
it is certainly a problem for images that have software watchdogs that
verify the system booted up all the way to systemd completion in less
than 90 seconds.

The other nice effect of this change is that the fast fail extends to
additional serial ports that may not exist on ARM BSPs or that might
be configured in or out by the dtb files on different boards.

Signed-off-by: Jason Wessel
---
 .../systemd/systemd-serialgetty/serial-getty@.service | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service b/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
index e8b027e97d..a20092a173 100644
--- a/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
+++ b/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
@@ -10,7 +10,7 @@ Description=Serial Getty on %I
 Documentation=man:agetty(8) man:systemd-getty-generator(8)
 Documentation=http://0pointer.de/blog/projects/serial-console.html
 BindsTo=dev-%i.device
-After=dev-%i.device systemd-user-sessions.service plymouth-quit-wait.service
+After=systemd-user-sessions.service plymouth-quit-wait.service
 After=rc-local.service
 
 # If additional gettys are spawned during boot then we should make
-- 
2.21.0
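To tell which variant of the template a given tree or image carries, a check along these lines may be handy. It is a minimal sketch: the function name `unit_waits_for_device` is made up for illustration, and it only inspects `After=` lines in the one file it is given, so drop-ins and generated units are not considered.

```shell
#!/bin/sh
# Hypothetical helper (not part of OE-core): succeeds if the unit file $1
# still lists the templated device unit dev-%i.device on an After= line,
# i.e. the getty start is ordered after the device job.
unit_waits_for_device() {
    grep -E '^After=' "$1" | grep -q 'dev-%i\.device'
}

# Example use against the template touched by this patch (path from the diff):
#   if unit_waits_for_device meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
#   then echo "ordered after the device (may wait for the device timeout)"
#   else echo "no device ordering (fast fail)"
#   fi
```

The `BindsTo=` line is deliberately ignored, since the patch leaves it in place and only changes the `After=` ordering.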