On Wed, Nov 16, 2016 at 5:55 AM, Brian Starkey <[email protected]> wrote:
> Hi,
>
> I'm running an ARM FVP (virtual platform - simulated hardware), which
> is failing to reach a login prompt due to extremely slow progress
> during boot. systemd gives up waiting for the ttyAMA0 device to
> appear, and never starts the getty.
>
> I've bisected this to commit 4cd13c21b207 "softirq: Let ksoftirqd do
> its job".
>
> Without this commit, the system boots to a login prompt in 2 minutes.
> With this commit, the system eventually manages to bring up sshd after
> 22 minutes, but as mentioned, the dev-ttyAMA0.device unit has timed
> out and so I don't get a prompt on my console.
>
> I only hit the issue when my rootfs is mounted over NFS, and with only
> a single core enabled. The (simulated) network device is an SMC91C111.
> With multiple cores enabled or a non-NFS filesystem, everything seems
> to work OK.
>
> I don't have an identical real hardware platform to try, but I
> could not reproduce it on a real ARM Juno board, which is similar.
>
> It looks from the logs that udev's workers are unable to make
> progress, so the device nodes don't get created. Don't pay too much
> attention to the timestamps in the logs below, they are "inside" the
> virtual platform, and don't reflect wall-clock time.
>
> Log before 4cd13c21b207:
> https://drive.google.com/open?id=0B8siaK6ZjvEwMktoa0NUS2hJd1U
> Log after 4cd13c21b207:
> https://drive.google.com/open?id=0B8siaK6ZjvEwZXlfeFFSQl9xZTQ
> Kernel config: arch/arm64/configs/defconfig
>
> I'm not sure how to debug this further, so if you have any suggestions
> I'd be glad to hear them.
>
> Many thanks,
> Brian
>

Hi Brian.

Thanks a lot for this report.

If the issue triggers only when a single core is in use, it is possible
that a driver depends on softirqs being serviced during an
initialization loop.

If that thread does not yield the CPU (for example because it holds
something like a spinlock, which disables preemption), then ksoftirqd
might not be able to run on that same CPU.

I sent a patch for busy polling yesterday, but I am almost certain it
would not fix your issue (assuming you have CONFIG_PREEMPT):

https://patchwork.ozlabs.org/patch/695185/
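
To make the suspected failure mode concrete, here is a toy single-CPU
model. It is not kernel code and only a sketch of the hypothesis above:
the function simulate() and its parameters are made up for
illustration, and the scheduling rules are heavily simplified. A waiter
task spins until a softirq has been serviced; the softirq is either run
inline at interrupt exit or left to ksoftirqd, which only gets the CPU
if the waiter yields.

```python
def simulate(defer_to_ksoftirqd, waiter_yields, ticks=10, irq_tick=2):
    """Toy model of one CPU; returns the tick at which the softirq is
    serviced, or None if it is never serviced within `ticks`.

    All names here are hypothetical and exist only for this sketch.
    """
    pending = False
    for tick in range(ticks):
        if tick == irq_tick:
            # A hardware interrupt raises a softirq on this tick.
            pending = True
            if not defer_to_ksoftirqd:
                # Softirq is run inline at interrupt exit.
                return tick
            # Otherwise the work is left for ksoftirqd.
            continue
        if pending and waiter_yields:
            # The spinning task gave up the CPU, so ksoftirqd can run.
            return tick
        # Waiter keeps spinning without yielding: ksoftirqd is starved.
    return None


if __name__ == "__main__":
    print("inline softirq:          ", simulate(False, False))
    print("deferred, waiter yields: ", simulate(True, True))
    print("deferred, waiter spins:  ", simulate(True, False))
```

In this model only the last case never makes progress, which is the
single-core situation I have in mind. With a second CPU, or with a
waiter that can be preempted, the deferred work would still get to run.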

