Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-28 Thread Jason Wessel

On 8/28/19 5:41 PM, Richard Purdie wrote:

On Wed, 2019-08-28 at 15:24 -0500, Jason Wessel wrote:

On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:

On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:

On 8/27/19 5:58 PM, Richard Purdie wrote:

Hi Jason,
Somehow this change is responsible for this build failure:

https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976

(steps 5c and 7c so failure during testimage).

I have bisected it to this change, I haven't looked into why.

Thanks for tracking it down.  I am not sure how to try and duplicate
this.  I clicked around to try and find out a bit about what it is
running for these phases of the build but it is not very obvious.

Is there a local.conf I can try along with whatever commands it
ran?

The configuration it's using is shown in
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
for each step.


I tried the configuration on my system but it seems to work fine, see
below.  Is there a way to get the complete log file of the boot on the
broken host?  I am also curious whether it fails every time or not.

/home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722

I am not sure it will tell us anything further or not, but it
certainly looks like the system didn't boot correctly in the first
place.

I've rerun it as
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/983




I'll take a further look at it tomorrow.  I have no idea why systemd emitted
these messages:

[    4.750194] systemd[1]: Unnecessary job for /dev/ttyS1 was removed.
[    4.751666] systemd[1]: Unnecessary job for /dev/ttyS0 was removed.

[  OK  ] Started Serial Getty on ttyS0.
 Stopping Serial Getty on ttyS0...

I now have an image with the problem, so I'll make a copy of it in case the
problem "goes away" again.  It will be interesting to see what the deal is
with this corner case.

Cheers,

Jason.




--
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-28 Thread Richard Purdie
On Wed, 2019-08-28 at 15:24 -0500, Jason Wessel wrote:
> On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:
> > On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> > > On 8/27/19 5:58 PM, Richard Purdie wrote:
> > > > Hi Jason,
> > > > Somehow this change is responsible for this build failure:
> > > > 
> > > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > > > 
> > > > (steps 5c and 7c so failure during testimage).
> > > > 
> > > > I have bisected it to this change, I haven't looked into why.
> > > 
> > > Thanks for tracking it down.  I am not sure how to try and duplicate
> > > this.  I clicked around to try and find out a bit about what it is
> > > running for these phases of the build but it is not very obvious.
> > > 
> > > Is there a local.conf I can try along with whatever commands it
> > > ran?
> > 
> > The configuration it's using is shown in
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
> > for each step.
> > 
> 
> I tried the configuration on my system but it seems to work fine, see
> below.  Is there a way to get the complete log file of the boot on the
> broken host?  I am also curious whether it fails every time or not.
> 
> /home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722
> 
> I am not sure it will tell us anything further or not, but it
> certainly looks like the system didn't boot correctly in the first
> place.

I've rerun it as 
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/983

Cheers,

Richard



Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-28 Thread richard . purdie
On Wed, 2019-08-28 at 15:24 -0500, Jason Wessel wrote:
> On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:
> > On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> > > On 8/27/19 5:58 PM, Richard Purdie wrote:
> > > > Hi Jason,
> > > > Somehow this change is responsible for this build failure:
> > > > 
> > > > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > > > 
> > > > (steps 5c and 7c so failure during testimage).
> > > > 
> > > > I have bisected it to this change, I haven't looked into why.
> > > 
> > > Thanks for tracking it down.  I am not sure how to try and duplicate
> > > this.  I clicked around to try and find out a bit about what it is
> > > running for these phases of the build but it is not very obvious.
> > > 
> > > Is there a local.conf I can try along with whatever commands it
> > > ran?
> > 
> > The configuration it's using is shown in
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
> > for each step.
> > 
> 
> I tried the configuration on my system but it seems to work fine, see
> below.  Is there a way to get the complete log file of the boot on the
> broken host?  I am also curious whether it fails every time or not.
> 
> /home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722

If we'd got this straight after the build then yes, but it's been
recycled now.

It did seem to fail repeatedly when I tested it. I'd probably have to
add the patch back and retest to get the log.

> I am not sure it will tell us anything further or not, but it
> certainly looks like the system didn't boot correctly in the first
> place.

It does seem like a boot issue somehow...

The test case on the autobuilder seems simple enough: systemd boot
alone with no sysvinit.

Cheers,

Richard



Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-28 Thread Jason Wessel

On 8/27/19 7:15 PM, richard.pur...@linuxfoundation.org wrote:

On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:


On 8/27/19 5:58 PM, Richard Purdie wrote:

Hi Jason,
Somehow this change is responsible for this build failure:

https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976

(steps 5c and 7c so failure during testimage).

I have bisected it to this change, I haven't looked into why.


Thanks for tracking it down.  I am not sure how to try and duplicate
this.  I clicked around to try and find out a bit about what it is
running for these phases of the build but it is not very obvious.

Is there a local.conf I can try along with whatever commands it ran?


The configuration it's using is shown in
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
for each step.




I tried the configuration on my system but it seems to work fine, see below.
Is there a way to get the complete log file of the boot on the broken host?
I am also curious whether it fails every time or not.

/home/pokybuild/yocto-worker/qa-extras2/build/build/tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/qemu_boot_log.2019082722

I am not sure it will tell us anything further or not, but it certainly looks 
like the system didn't boot correctly in the first place.


Jason.





Initialising tasks: 100%
Sstate summary: Wanted 0 Found 0 Missed 0 Current 90 (0% match, 100% complete)
NOTE: Executing Tasks
NOTE: Setscene tasks completed
Started HTTPService on 128.224.149.46:38317
Test requires apt to be installed
Stopped HTTPService on 128.224.149.46:38317
Test requires autoconf to be installed
Test requires gtk+3 to be installed
Test requires autoconf to be installed
Started HTTPService on 128.224.149.46:32961
Stopped HTTPService on 128.224.149.46:32961
Test requires gcc to be installed
Test requires g++ to be installed
Test requires g++ to be installed
Test requires make to be installed
Test requires python3-pygobject to be installed
Test requires gcc to be installed
Test requires ldd to be installed
Test requires logrotate to be installed
Test case logrotate.LogrotateTest.test_2_logrotate depends on 
logrotate.LogrotateTest.test_1_logrotate_setup but it didn'
Not appropiate for systemd image
Test requires opkg to be installed
Test requires pam to be in DISTRO_FEATURES
Test requires ptest-runner to be installed
Test requires systemtap to be installed
Startup finished in 3.840s (kernel) + 2.157s (userspace) = 5.998s.
RESULTS:
RESULTS - connman.ConnmanTest.test_connmand_help: PASSED (0.11s)
RESULTS - connman.ConnmanTest.test_connmand_running: PASSED (0.11s)
RESULTS - date.DateTest.test_date: PASSED (0.63s)
RESULTS - df.DfTest.test_df: PASSED (0.11s)
RESULTS - dnf.DnfBasicTest.test_dnf_help: PASSED (0.78s)
RESULTS - dnf.DnfBasicTest.test_dnf_history: PASSED (0.44s)
RESULTS - dnf.DnfBasicTest.test_dnf_info: PASSED (0.37s)
RESULTS - dnf.DnfBasicTest.test_dnf_search: PASSED (0.35s)
RESULTS - dnf.DnfBasicTest.test_dnf_version: PASSED (0.32s)
RESULTS - dnf.DnfRepoTest.test_dnf_exclude: PASSED (5.03s)
RESULTS - dnf.DnfRepoTest.test_dnf_install: PASSED (0.78s)
RESULTS - dnf.DnfRepoTest.test_dnf_install_dependency: PASSED (1.77s)
RESULTS - dnf.DnfRepoTest.test_dnf_install_from_disk: PASSED (1.97s)
RESULTS - dnf.DnfRepoTest.test_dnf_install_from_http: PASSED (1.52s)
RESULTS - dnf.DnfRepoTest.test_dnf_installroot: PASSED (6.91s)
RESULTS - dnf.DnfRepoTest.test_dnf_makecache: PASSED (0.42s)
RESULTS - dnf.DnfRepoTest.test_dnf_reinstall: PASSED (0.61s)
RESULTS - dnf.DnfRepoTest.test_dnf_repoinfo: PASSED (0.35s)
RESULTS - oe_syslog.SyslogTest.test_syslog_running: PASSED (0.12s)
RESULTS - oe_syslog.SyslogTestConfig.test_syslog_logger: PASSED (1.21s)
RESULTS - oe_syslog.SyslogTestConfig.test_syslog_restart: PASSED (0.21s)
RESULTS - parselogs.ParseLogsTest.test_parselogs: PASSED (1.91s)
RESULTS - perl.PerlTest.test_perl_works: PASSED (0.11s)
RESULTS - ping.PingTest.test_ping: PASSED (0.06s)
RESULTS - python.PythonTest.test_python3: PASSED (0.12s)
RESULTS - rpm.RpmBasicTest.test_rpm_help: PASSED (0.11s)
RESULTS - rpm.RpmBasicTest.test_rpm_query: PASSED (0.18s)
RESULTS - rpm.RpmBasicTest.test_rpm_query_nonroot: PASSED (0.89s)
RESULTS - rpm.RpmInstallRemoveTest.test_check_rpm_install_removal_log_file_size: PASSED (3.54s)
RESULTS - rpm.RpmInstallRemoveTest.test_rpm_install: PASSED (0.37s)
RESULTS - rpm.RpmInstallRemoveTest.test_rpm_remove: PASSED (0.15s)
RESULTS - scp.ScpTest.test_scp_file: PASSED (0.41s)
RESULTS - ssh.SSHTest.test_ssh: PASSED (0.51s)
RESULTS - systemd.SystemdBasicTests.test_systemd_basic: PASSED (0.12s)
RESULTS - systemd.SystemdBasicTests.test_systemd_failed: PASSED (0.21s)
RESULTS - systemd.SystemdBasicTests.test_systemd_list: PASSED (0.44s)
RESULTS - systemd.SystemdJournalTests.test_systemd_boot_time: PASSED (0.11s)
RESULTS - 

Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-27 Thread richard . purdie
On Tue, 2019-08-27 at 19:03 -0500, Jason Wessel wrote:
> 
> On 8/27/19 5:58 PM, Richard Purdie wrote:
> > Hi Jason,
> > Somehow this change is responsible for this build failure:
> > 
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976
> > 
> > (steps 5c and 7c so failure during testimage).
> > 
> > I have bisected it to this change, I haven't looked into why.
> 
> Thanks for tracking it down.  I am not sure how to try and duplicate
> this.  I clicked around to try and find out a bit about what it is
> running for these phases of the build but it is not very obvious.
> 
> Is there a local.conf I can try along with whatever commands it ran?

The configuration it's using is shown in
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/978/steps/8/logs/stdio
for each step.

Cheers,

Richard



Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-27 Thread Jason Wessel



On 8/27/19 5:58 PM, Richard Purdie wrote:

Hi Jason,
Somehow this change is responsible for this build failure:

https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976

(steps 5c and 7c so failure during testimage).

I have bisected it to this change, I haven't looked into why.



Thanks for tracking it down.  I am not sure how to try and duplicate this.  I
clicked around to try and find out a bit about what it is running for these
phases of the build but it is not very obvious.

Is there a local.conf I can try along with whatever commands it ran?


Jason.



Re: [OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-27 Thread Richard Purdie
On Tue, 2019-08-20 at 17:27 -0700, Jason Wessel wrote:
> Some BSPs use a USB serial port which may or may not actually be
> plugged in all the time.  It is quite useful to have a getty running
> on a USB serial port, but it does not make sense to wait for it for 90
> seconds before completing the system startup if it might never get
> plugged in.  The typical example is that a USB serial device might
> only need to be plugged in when debugging, upgrading, or initially
> configuring a device.
> 
> This change is somewhat subtle.  Systemd uses the "BindsTo" directive
> to ensure existence of the device in order to start the service as
> well as to terminate the service if the device goes away.  The "After"
> directive makes that same relationship stronger, and has the undesired
> side effect that systemd will wait until its internal timeout value
> for the device to come online before executing a fail operation or
> letting other tasks and groups continue.  This is certainly the kind
> of behavior we want for a disk, but not for serial ports in general.
> 
> The kernel module loader and device detection will have run a long
> time before the getty startup.  By the time the getty startup occurs
> the system has all the serial devices it's going to get.
> 
> If you want to observe the problem with qemu, it is easy to replicate.
> Simply add the following line to your local.conf for an x86-64 qemu
> build.
> 
> SERIAL_CONSOLES="115200;ttyS0 115200;ttyUSB0"
> 
> Log in right after the system boots and observe:
> 
>    root@qemux86-64:~# systemctl list-jobs |cat
>    JOB UNIT                                 TYPE  STATE
>      1 multi-user.target                    start waiting
>     69 serial-getty@ttyUSB0.service         start waiting
>     64 getty.target                         start waiting
>     71 dev-ttyUSB0.device                   start running
>     62 systemd-update-utmp-runlevel.service start waiting
> 
>    5 jobs listed.
> 
> You can see above that the dev-ttyUSB0.device job will block for 1 min
> 30 seconds.  While that might not be a problem for this reference build,
> it is certainly a problem for images that have software watchdogs that
> verify the system booted up all the way to systemd completion in less
> than 90 seconds.
> 
> The other nice effect of this change is that the fast-fail behavior
> extends to additional serial ports that may not exist on ARM BSPs or
> that might be configured in or out by the dtb files on different
> boards.
> 
> Signed-off-by: Jason Wessel 
> ---
>  .../systemd/systemd-serialgetty/serial-getty@.service   | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Hi Jason,

Somehow this change is responsible for this build failure:

https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/976

(steps 5c and 7c so failure during testimage).

I have bisected it to this change, I haven't looked into why.

Cheers,

Richard



[OE-core] [PATCH] serial-getty@.service: Allow device to fast fail if it does not exist

2019-08-20 Thread Jason Wessel
Some BSPs use a USB serial port which may or may not actually be
plugged in all the time.  It is quite useful to have a getty running
on a USB serial port, but it does not make sense to wait for it for 90
seconds before completing the system startup if it might never get
plugged in.  The typical example is that a USB serial device might
only need to be plugged in when debugging, upgrading, or initially
configuring a device.

This change is somewhat subtle.  Systemd uses the "BindsTo" directive
to ensure existence of the device in order to start the service as
well as to terminate the service if the device goes away.  The "After"
directive makes that same relationship stronger, and has the undesired
side effect that systemd will wait until its internal timeout value
for the device to come online before executing a fail operation or
letting other tasks and groups continue.  This is certainly the kind
of behavior we want for a disk, but not for serial ports in general.

The kernel module loader and device detection will have run a long
time before the getty startup.  By the time the getty startup occurs
the system has all the serial devices it's going to get.

If you want to observe the problem with qemu, it is easy to replicate.
Simply add the following line to your local.conf for an x86-64 qemu
build.

SERIAL_CONSOLES="115200;ttyS0 115200;ttyUSB0"
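
As a rough illustration of how a value like the one above breaks down,
here is a minimal Python sketch.  It assumes each whitespace-separated
entry has the form "baud;tty", and it assumes one serial-getty@
instance per tty, which matches the unit name seen in the job listing
in this message.  The function names are hypothetical and are not part
of OE-core.

```python
def parse_serial_consoles(value):
    """Split a SERIAL_CONSOLES-style string into (baud, tty) pairs."""
    consoles = []
    for entry in value.split():
        # Each entry is assumed to look like "115200;ttyS0".
        baud, _, tty = entry.partition(";")
        if not baud.isdigit() or not tty:
            raise ValueError(f"unexpected entry: {entry!r}")
        consoles.append((int(baud), tty))
    return consoles


def getty_units(value):
    """Name the serial-getty@ instance assumed to exist for each console."""
    return [f"serial-getty@{tty}.service"
            for _, tty in parse_serial_consoles(value)]


if __name__ == "__main__":
    for unit in getty_units("115200;ttyS0 115200;ttyUSB0"):
        print(unit)
```

With the example value, the second entry maps to the ttyUSB0 getty
instance, which is the one that waits on a device that never appears
under qemu.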

Log in right after the system boots and observe:

   root@qemux86-64:~# systemctl list-jobs |cat
   JOB UNIT                                 TYPE  STATE
     1 multi-user.target                    start waiting
    69 serial-getty@ttyUSB0.service         start waiting
    64 getty.target                         start waiting
    71 dev-ttyUSB0.device                   start running
    62 systemd-update-utmp-runlevel.service start waiting

   5 jobs listed.
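
A check like the one above could be scripted, for example in a boot
test.  The following is only a sketch: it assumes the four-column
layout shown in the listing (JOB, UNIT, TYPE, STATE), and the helper
names parse_jobs and pending_devices are made up for illustration.

```python
SAMPLE = """\
JOB UNIT                                 TYPE  STATE
  1 multi-user.target                    start waiting
 69 serial-getty@ttyUSB0.service         start waiting
 64 getty.target                         start waiting
 71 dev-ttyUSB0.device                   start running
 62 systemd-update-utmp-runlevel.service start waiting

5 jobs listed.
"""


def parse_jobs(text):
    """Turn `systemctl list-jobs` style output into a list of dicts."""
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # Job rows have exactly four fields and start with a numeric id;
        # the header and the "N jobs listed." summary are skipped.
        if len(fields) == 4 and fields[0].isdigit():
            jobs.append({"id": int(fields[0]), "unit": fields[1],
                         "type": fields[2], "state": fields[3]})
    return jobs


def pending_devices(jobs):
    """Return device units whose start job is still running."""
    return [job["unit"] for job in jobs
            if job["unit"].endswith(".device") and job["state"] == "running"]


if __name__ == "__main__":
    print(pending_devices(parse_jobs(SAMPLE)))
```

On a live target the text would come from running the command itself
rather than from SAMPLE.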

You can see above that the dev-ttyUSB0.device job will block for 1 min
30 seconds.  While that might not be a problem for this reference build,
it is certainly a problem for images that have software watchdogs that
verify the system booted up all the way to systemd completion in less
than 90 seconds.

The other nice effect of this change is that the fast-fail behavior
extends to additional serial ports that may not exist on ARM BSPs or
that might be configured in or out by the dtb files on different
boards.

Signed-off-by: Jason Wessel 
---
 .../systemd/systemd-serialgetty/serial-getty@.service   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service b/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
index e8b027e97d..a20092a173 100644
--- a/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
+++ b/meta/recipes-core/systemd/systemd-serialgetty/serial-getty@.service
@@ -10,7 +10,7 @@ Description=Serial Getty on %I
 Documentation=man:agetty(8) man:systemd-getty-generator(8)
 Documentation=http://0pointer.de/blog/projects/serial-console.html
 BindsTo=dev-%i.device
-After=dev-%i.device systemd-user-sessions.service plymouth-quit-wait.service
+After=systemd-user-sessions.service plymouth-quit-wait.service
 After=rc-local.service
 
 # If additional gettys are spawned during boot then we should make
-- 
2.21.0
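
For anyone who wants to confirm which variant of the unit file an image
carries, a small Python sketch along these lines may help.  It is not
part of the patch.  It assumes a simple line-based reading of the unit
text is good enough (systemd's own parser handles more cases, such as
empty assignments that reset a list), and the function names are
hypothetical.

```python
OLD_UNIT = """\
[Unit]
BindsTo=dev-%i.device
After=dev-%i.device systemd-user-sessions.service plymouth-quit-wait.service
After=rc-local.service
"""

NEW_UNIT = """\
[Unit]
BindsTo=dev-%i.device
After=systemd-user-sessions.service plymouth-quit-wait.service
After=rc-local.service
"""


def read_directives(unit_text):
    """Collect directive values; repeated keys such as After= accumulate."""
    directives = {}
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, section headers and lines without "=".
        if not line or line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        directives.setdefault(key.strip(), []).extend(value.split())
    return directives


def orders_after_device(unit_text, device="dev-%i.device"):
    """True if the unit is ordered After= the given device unit."""
    return device in read_directives(unit_text).get("After", [])


if __name__ == "__main__":
    print("old:", orders_after_device(OLD_UNIT))
    print("new:", orders_after_device(NEW_UNIT))
```

In both variants BindsTo= still names the device; only the After=
ordering on it differs.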
