Re: [systemd-devel] Please help: timeout waiting for /dev/tty* console device

2023-01-09 Thread Gabriel L. Somlo
On Mon, Jan 09, 2023 at 01:37:58PM +0100, Lennart Poettering wrote:
> On Fr, 06.01.23 19:15, Gabriel L. Somlo (gso...@gmail.com) wrote:
> 
> > Hi,
> >
> > I'm failing to get a login prompt on the serial console of my system,
> > because a few steps earlier serial-getty@.service fails due to a
> > dependency on the actual tty device, which times out:
> >
> > [ TIME ] Timed out waiting for device …ttyLXU0.device - /dev/ttyLXU0.
> > [DEPEND] Dependency failed for seri…ice - Serial Getty on ttyLXU0.
> 
> So you are saying that the device actually *does* pop up eventually,
> but your system is simply so awfully slow that the default time-outs
> are hit?

Yeah, at 50MHz and 512MB RAM, it's just about the slowest, most
resource-constrained thing that I'm willing to call a "computer",
by virtue of its actually being able to (barely) boot Fedora's
riscv f37 port... :)

> If so, you can solve this locally for dev-ttyLXU0.device by adding a
> JobTimeoutSec= drop-in file (for the [Unit] section).
> 
> Or if you want to increase the time-out globally, consider setting
> DefaultTimeoutStartSec= in /etc/systemd/system.conf to any value you
> like.

I went with "systemd.default_timeout_start_sec=360s" (as also
suggested by Martin elsewhere in this thread). That actually fixed
the tty issue, and also a bunch of other services that were failing
earlier...
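
For reference, the two configuration-file alternatives suggested above
could look roughly like the sketch below. The drop-in directory follows
the usual systemd drop-in convention and the file name `timeout.conf` is
hypothetical; neither comes from this thread. The 360s value simply
mirrors the one used on the kernel command line. Only one of the two
fragments would be needed, and neither was tested on this system.

```ini
# Option 1: per-device drop-in (path and file name are assumptions)
# /etc/systemd/system/dev-ttyLXU0.device.d/timeout.conf
[Unit]
JobTimeoutSec=360s

# Option 2: global default start time-out
# /etc/systemd/system.conf
[Manager]
DefaultTimeoutStartSec=360s
```

A reboot picks up either change.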

Thanks again for the quick replies!

--Gabriel


Re: [systemd-devel] Please help: timeout waiting for /dev/tty* console device

2023-01-06 Thread Gabriel L. Somlo
On Fri, Jan 06, 2023 at 07:15:14PM -0500, Gabriel L. Somlo wrote:
> Hi,
> 
> I'm failing to get a login prompt on the serial console of my system,
> because a few steps earlier serial-getty@.service fails due to a
> dependency on the actual tty device, which times out:
> 
> [ TIME ] Timed out waiting for device …ttyLXU0.device - /dev/ttyLXU0.
> [DEPEND] Dependency failed for seri…ice - Serial Getty on ttyLXU0.
> 
> This eventually results in "Failed to start systemd-logind.service",
> and no login prompt on the serial console.
> 
> I'm trying to get a riscv64 port of fedora (systemd version 251) running
> on a system that can be considered "exotic" and rather on the slow
> side -- it's an FPGA soft-cpu system using the RocketChip running at
> 50MHz.
> 
> I got it working successfully on an older version of the fedora-riscv
> port, f33, using systemd 246.
> 
> I can also get it working with the current systemd on a 4-core Rocket
> chip deployed on a large enough FPGA.
> 
> It (udev?) times out on the single-core version of the design, and I'm
> wondering if there are any available options to get systemd and/or
> udev to be more "patient".
> 
> I tried booting with `udev.children_max=1` and `udev.event_timeout=800`
> on the kernel command line.
> 
> I also tried modifying systemd-udev-settle.service like so:
>   TimeoutSec=720
>   ExecStart=udevadm settle --timeout=680
> 
> None of the above seem to help on the single-core 50MHz rocket-chip
> system (and are not needed on the 4-core version).
> 
> Any other tricks I can use to force it to wait (or to otherwise
> encourage it to find, sooner) /dev/ttyLXU0?
> 
> I can't log into this machine to run any tests. It does manage to connect
> to the network, start NetworkManager.service and sshd.service, but any
> attempt to ssh in over the network fails (after successful auth) with
> "connection closed by  port 22". Not sure if due to the
> user login service having failed, or some other unrelated timeout.

Turns out it was just really slow loading bash :) After a while, I did
manage to log in and obtain a root shell over ssh. From there I did:

udevadm trigger --type=devices --action=add
udevadm trigger --type=subsystems --action=add

which had no visible effect. After that, I did:

systemctl restart serial-getty@ttyLXU0.service

which resulted in a login prompt showing up on the serial console!

I *can* run any tests y'all might suggest to further debug the state
of the system. But at this point I really do believe there is (or
should be :) a way to extend the timeout during initial boot to force
the system to wait for /dev/ttyLXU0 to become available (via udev?).

Thanks much,
--Gabriel


[systemd-devel] Please help: timeout waiting for /dev/tty* console device

2023-01-06 Thread Gabriel L. Somlo
Hi,

I'm failing to get a login prompt on the serial console of my system,
because a few steps earlier serial-getty@.service fails due to a
dependency on the actual tty device, which times out:

[ TIME ] Timed out waiting for device …ttyLXU0.device - /dev/ttyLXU0.
[DEPEND] Dependency failed for seri…ice - Serial Getty on ttyLXU0.

This eventually results in "Failed to start systemd-logind.service",
and no login prompt on the serial console.

I'm trying to get a riscv64 port of fedora (systemd version 251) running
on a system that can be considered "exotic" and rather on the slow
side -- it's an FPGA soft-cpu system using the RocketChip running at
50MHz.

I got it working successfully on an older version of the fedora-riscv
port, f33, using systemd 246.

I can also get it working with the current systemd on a 4-core Rocket
chip deployed on a large enough FPGA.

It (udev?) times out on the single-core version of the design, and I'm
wondering if there are any available options to get systemd and/or
udev to be more "patient".

I tried booting with `udev.children_max=1` and `udev.event_timeout=800`
on the kernel command line.

I also tried modifying systemd-udev-settle.service like so:
  TimeoutSec=720
  ExecStart=udevadm settle --timeout=680

None of the above seem to help on the single-core 50MHz rocket-chip
system (and are not needed on the 4-core version).

Any other tricks I can use to force it to wait (or to otherwise
encourage it to find, sooner) /dev/ttyLXU0?

I can't log into this machine to run any tests. It does manage to connect
to the network, start NetworkManager.service and sshd.service, but any
attempt to ssh in over the network fails (after successful auth) with
"connection closed by  port 22". Not sure if due to the
user login service having failed, or some other unrelated timeout.

I can (also) ssh into the 4-core version of the system, and can run
tests and report results from that one if it would help troubleshoot
the issue (the systems are otherwise 100% identical in terms of
"hardware" -- same HDL sources -- and only differ by the core count of
the CPU module).

Any help and/or ideas much appreciated, thanks in advance!

Best,
--Gabriel