bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-22 Thread Leo Famulari
On Wed, Dec 22, 2021 at 04:16:30PM -0500, Leo Famulari wrote:
> On Tue, Dec 21, 2021 at 09:19:20PM -0500, Leo Famulari wrote:
> > Would someone like to write a proper commit message and add a code
> > comment?
> 
> How about the attached patch? I'd like to push this soon, because it's a
> severe problem for some users.

> From ab9c4de76dea889e5d05bcf7fa173868357d5f44 Mon Sep 17 00:00:00 2001
> From: Timothy Sample 
> Date: Tue, 21 Dec 2021 11:52:34 -0500
> Subject: [PATCH] services: dbus: Wait 1 minute for elogind to get ready.
> 
> Fixes .
> 
> * gnu/services/dbus.scm (dbus-configuration-directory): Set a 60 second timeout
> in the dbus config.

Pushed as 488f1c589df00e802163af534294d93372e5c025




bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-22 Thread Leo Famulari
On Tue, Dec 21, 2021 at 09:19:20PM -0500, Leo Famulari wrote:
> Would someone like to write a proper commit message and add a code
> comment?

How about the attached patch? I'd like to push this soon, because it's a
severe problem for some users.
From ab9c4de76dea889e5d05bcf7fa173868357d5f44 Mon Sep 17 00:00:00 2001
From: Timothy Sample 
Date: Tue, 21 Dec 2021 11:52:34 -0500
Subject: [PATCH] services: dbus: Wait 1 minute for elogind to get ready.

Fixes .

* gnu/services/dbus.scm (dbus-configuration-directory): Set a 60 second timeout
in the dbus config.
---
 gnu/services/dbus.scm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/gnu/services/dbus.scm b/gnu/services/dbus.scm
index 85a4c3ec9a..d2daf60497 100644
--- a/gnu/services/dbus.scm
+++ b/gnu/services/dbus.scm
@@ -106,6 +106,10 @@ (define-syntax directives
 (define (services->sxml services)
   ;; Return the SXML 'includedir' clauses for DIRS.
   `(busconfig
+ ;; Increase this timeout to 60 seconds to work around race-y
+ ;; failures such as  on slow
+ ;; computers with slow I/O.
+(limit (@ (name "auth_timeout")) "60000")
 (servicehelper "/run/setuid-programs/dbus-daemon-launch-helper")
 
 ;; First, the '.service' files of services subject to activation.
-- 
2.34.0





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Leo Famulari
On Wed, Dec 22, 2021 at 01:46:41AM +0000, Luis Felipe via Bug reports for GNU Guix wrote:
> Hello,
> 
> Just wanted to inform that I ran
> 
> `guix pull --branch=wip-fix-52051 && guix system reconfigure [...]`
> 
> and I could log in without problems now. Thanks.

Awesome!

Would someone like to write a proper commit message and add a code
comment?




bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Luis Felipe via Bug reports for GNU Guix
Hello,

Just wanted to inform that I ran

`guix pull --branch=wip-fix-52051 && guix system reconfigure [...]`

and I could log in without problems now. Thanks.



bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Leo Famulari
On Tue, Dec 21, 2021 at 05:51:28PM +0000, Maxime Devos wrote:
> I don't think commenting new code is a drastic and extreme action.
> I assume you were referring to the ‘16.6 hours’?

Right. We don't want to use a timeout of 16.6 hours. From the
perspective of a user, that's not meaningfully longer than, say, 30
minutes. And really, if the system can't bring up the login interface
for a laptop in 5 minutes, it's totally broken.

> If so, just replace ‘16.6 hours’ by ‘one minute’, because it turns
> out that 'auth_timeout' is in milliseconds and not seconds,
> see .

I think I was confused about the units here.
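
For reference, the unit mix-up discussed here can be checked with a little arithmetic. The following is only a sketch: the helper names are made up for illustration, and 60000 is the value that corresponds to the one-minute timeout named in the commit message.

```python
# Value under discussion: one minute expressed in milliseconds.
AUTH_TIMEOUT = 60000


def seconds_if_milliseconds(value):
    """Duration in seconds when the value is read as milliseconds."""
    return value / 1000


def hours_if_seconds(value):
    """Duration in hours if the same value were misread as seconds."""
    return value / 3600


print(seconds_if_milliseconds(AUTH_TIMEOUT))    # 60.0, i.e. one minute
print(round(hours_if_seconds(AUTH_TIMEOUT), 1)) # 16.7, the "16.6 hours" mentioned above
```

Read as milliseconds, the value gives the intended one-minute limit; read as seconds, it gives the multi-hour figure that prompted the objection.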





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Maxime Devos
Leo Famulari schreef op di 21-12-2021 om 12:32 [-0500]:
> > A comment like 
> > 
> > > ;; Set timeout to a huge number (16.6 hours), because
> > > ;; upstream often sets timeouts low for spinning disks,
> > > ;; slow CPUs, etc.
> > > (limit [...] "60000")
> > 
> > could be useful (I'm assuming the timeout is in seconds here).
> 
> I suggest we wait until such drastic action is necessary. Otherwise we
> might be banging our heads against the wall in a few years, trying to
> debug something. Let's not rush to extremes :)

I don't think commenting new code is a drastic and extreme action.
I assume you were referring to the ‘16.6 hours’?
If so, just replace ‘16.6 hours’ by ‘one minute’, because it turns
out that 'auth_timeout' is in milliseconds and not seconds,
see .

Greetings,
Maxime.







bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Leo Famulari
On Tue, Dec 21, 2021 at 12:37:30PM -0500, Leo Famulari wrote:
> With this patch, I can boot and login to X on my X200s with HDD.
> Obviously this failure and subsequent fix are not deterministic, but
> it's a good sign! Let's get some more testing!

I pushed the patch to a Savannah branch to ease testing:

https://git.savannah.gnu.org/cgit/guix.git/log/?h=wip-fix-52051

So, you can do:

`guix pull --branch=wip-fix-52051 && guix system reconfigure [...]`

Or similar with time-machine




bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Leo Famulari
On Tue, Dec 21, 2021 at 11:36:22AM -0500, Timothy Sample wrote:
> diff --git a/gnu/services/dbus.scm b/gnu/services/dbus.scm
> index 85a4c3e..a680ed7 100644
> --- a/gnu/services/dbus.scm
> +++ b/gnu/services/dbus.scm
> @@ -106,6 +106,7 @@ (define-syntax directives
>  (define (services->sxml services)
>;; Return the SXML 'includedir' clauses for DIRS.
>`(busconfig
> +(limit (@ (name "auth_timeout")) "60000")
>  (servicehelper "/run/setuid-programs/dbus-daemon-launch-helper")
>  
>  ;; First, the '.service' files of services subject to activation.


With this patch, I can boot and login to X on my X200s with HDD.
Obviously this failure and subsequent fix are not deterministic, but
it's a good sign! Let's get some more testing! And comrades, let's test
the xorg-server update while doing it:

https://issues.guix.gnu.org/issue/52562





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Leo Famulari
On Tue, Dec 21, 2021 at 04:52:19PM +0000, Maxime Devos wrote:
> Are there any good reasons for having a timeout at all?
> (Except for the local-user denial of service, but local users can do
> "guix build -f something-that-allocates-almost-all-memory-and-melts-
> the-cpu.scm" anyway ...)
> 
> If not, can the timeout be disabled/set to infinity?

Could be. But I think that, based on several years of reports, the
X200 with an HDD is the slowest machine used with Guix System. On my
X200 with HDD, I have personally experienced similar race-y bugs that
seem to crop up after major upgrades --- I assume that it's a case of
bad luck, where important programs for booting move to distant parts of
the disk and seeking is too slow.

> A comment like 
> 
> > ;; Set timeout to a huge number (16.6 hours), because
> > ;; upstream often sets timeouts low for spinning disks,
> > ;; slow CPUs, etc.
> > (limit [...] "60000")
> 
> could be useful (I'm assuming the timeout is in seconds here).

I suggest we wait until such drastic action is necessary. Otherwise we
might be banging our heads against the wall in a few years, trying to
debug something. Let's not rush to extremes :)





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Maxime Devos
Timothy Sample schreef op di 21-12-2021 om 11:36 [-0500]:
> +    (limit (@ (name "auth_timeout")) "60000")


Are there any good reasons for having a timeout at all?
(Except for the local-user denial of service, but local users can do
"guix build -f something-that-allocates-almost-all-memory-and-melts-
the-cpu.scm" anyway ...)

If not, can the timeout be disabled/set to infinity?

A comment like 

> ;; Set timeout to a huge number (16.6 hours), because
> ;; upstream often sets timeouts low for spinning disks,
> ;; slow CPUs, etc.
> (limit [...] "60000")

could be useful (I'm assuming the timeout is in seconds here).

Greetings,
Maxime.






bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Timothy Sample
Hi Leo,

Leo Famulari  writes:

> On Tue, Dec 21, 2021 at 04:31:27AM -0500, Timothy Sample wrote:
>> [1] https://gitlab.freedesktop.org/dbus/dbus/-/blob/master/NEWS#L2487
>> [2] https://bugs.freedesktop.org/show_bug.cgi?id=86431#c3
>> 
>> After reading that I tried with the timeout bumped up to a minute, and
>> the X200 booted into GDM just fine, twice in a row, and then failed again
>> when I removed the change.  (I should add that it still printed the
>> “elogind is already running” messages, but it worked anyway.)
>
> Okay, great. I don't see "auth_timeout" in the Guix source tree. Can you say
> where you adjusted it?

Good question!  :)  Here’s the patch:

diff --git a/gnu/services/dbus.scm b/gnu/services/dbus.scm
index 85a4c3e..a680ed7 100644
--- a/gnu/services/dbus.scm
+++ b/gnu/services/dbus.scm
@@ -106,6 +106,7 @@ (define-syntax directives
 (define (services->sxml services)
   ;; Return the SXML 'includedir' clauses for DIRS.
   `(busconfig
+(limit (@ (name "auth_timeout")) "60000")
 (servicehelper "/run/setuid-programs/dbus-daemon-launch-helper")
 
 ;; First, the '.service' files of services subject to activation.


bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Leo Famulari
On Tue, Dec 21, 2021 at 04:31:27AM -0500, Timothy Sample wrote:
> [1] https://gitlab.freedesktop.org/dbus/dbus/-/blob/master/NEWS#L2487
> [2] https://bugs.freedesktop.org/show_bug.cgi?id=86431#c3
> 
> After reading that I tried with the timeout bumped up to a minute, and
> the X200 booted into GDM just fine, twice in a row, and then failed again
> when I removed the change.  (I should add that it still printed the
> “elogind is already running” messages, but it worked anyway.)

Okay, great. I don't see "auth_timeout" in the Guix source tree. Can you say
where you adjusted it?





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-21 Thread Timothy Sample
Hello,

I dug out an old HDD, put it in my X200, and was able to reproduce this.
Eventually I was able to log in to the thing by fiddling with the
“auth_timeout” parameter in the D-Bus config.

Josselin Poiret  writes:

> [...]
>
> Would it be possible to get both the elogind strace and the dbus log for
> the same session?  We could then simply `grep` the authentication cookie
> sent back by dbus to elogind to track the corresponding fd in the dbus
> log.

This is exactly what I did (thanks for posting the patches, Maxim).
While looking around in the D-Bus log, I noticed it complaining about
authentication expiry around where it mentions the PID of elogind.  This
is in Maxim’s log, too.  It counts down an authentication expiry with
messages like “Connection 0x23b9f50 authentication expires in ...”.
That brought me to the D-Bus NEWS file, which mentions adjusting
“auth_timeout” to fix a boot regression for users with older hardware
[1].  The NEWS file mentions a bug report [2] that discusses how this
might be related to hard disk speed.

[1] https://gitlab.freedesktop.org/dbus/dbus/-/blob/master/NEWS#L2487
[2] https://bugs.freedesktop.org/show_bug.cgi?id=86431#c3

After reading that I tried with the timeout bumped up to a minute, and
the X200 booted into GDM just fine, twice in a row, and then failed again
when I removed the change.  (I should add that it still printed the
“elogind is already running” messages, but it worked anyway.)

What’s weird is that this bug is very old (2014), and the default
timeout was increased at the time from 5s to 30s to solve the bug.  So
it’s part of the story, but not the whole story.


-- Tim





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-20 Thread Caleb Herbert
I am using an X200 Tablet with an HDD.  This happened on my system.  I 
am now on Fedora with Guix until this issue is fixed.




bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-16 Thread Michael Rohleder
Hi Guillaume,

Guillaume Le Vaillant  writes:
> I just got this login error when updating an old machine with a HDD as
> storage. On some other faster machines using SSD or NVME storage this
> issue never happened, so I thought the error might be triggered by
> slow IO.
>
> Do some of you also see the issue on fast machines/storage?

I can confirm this. For me, this happens on a machine with an HDD, but
not on another with an SSD, nor on my laptop with NVMe.

(It also "feels" like it's about timing.)






bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-16 Thread Josselin Poiret via Bug reports for GNU Guix
Hello Maxim and Ludovic,

Seeing all the new activity on this bug, I decided to take a closer
look.

It doesn't seem to me that the credential byte read is the problem, as
can be seen by the elogind strace: it sends

--8<---cut here---start->8---
374   sendmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0AUTH EXTERNAL\r\nDATA\r\n", iov_len=22}, {iov_base="NEGOTIATE_UNIX_FD\r\n", iov_len=19}, {iov_base="BEGIN\r\n", iov_len=7}], msg_iovlen=3, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 48
--8<---cut here---end--->8---

to which dbus answers with

--8<---cut here---start->8---
374   recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="DATA\r\nOK ea94265b9365f7158f1d81ee61b20dab\r\nAGREE_UNIX_FD\r\n", iov_len=256}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 58
--8<---cut here---end--->8---

so dbus doesn't seem to choke on the initial authentication (where the
credentials read happens) and it's clear that the byte 0 is properly
sent at the beginning of the handshake procedure.

Maxim Cournoyer  writes:

> I'm sorry for my lack of deep analysis here, I've spent most of my
> evening attempting to fix my system just to boot ^^'.  I've at least
> managed to collect the following verbose D-Bus log (54 MiB uncompressed)
> which hopefully can shed some light onto how this failure came to be.

Would it be possible to get both the elogind strace and the dbus log for
the same session?  We could then simply `grep` the authentication cookie
sent back by dbus to elogind to track the corresponding fd in the dbus
log.

Best,
Josselin Poiret
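
The correlation suggested above could be sketched roughly as follows. This is only an illustration: the function names and the sample D-Bus log lines are hypothetical, and it assumes that the token following "OK" in the strace output also appears verbatim in the verbose D-Bus log, which has not been verified here.

```python
import re


def token_after_ok(strace_text):
    """Return the 32-hex-digit token that follows 'OK ' in strace output, or None."""
    match = re.search(r"OK ([0-9a-f]{32})", strace_text)
    return match.group(1) if match else None


def lines_mentioning(token, log_lines):
    """Return (line_number, line) pairs for the log lines containing the token."""
    return [(n, line) for n, line in enumerate(log_lines, start=1) if token in line]


# Reply from dbus-daemon as captured in the elogind strace quoted above.
strace_line = (
    r'374   recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base='
    r'"DATA\r\nOK ea94265b9365f7158f1d81ee61b20dab\r\nAGREE_UNIX_FD\r\n", iov_len=256}]'
)

# Hypothetical D-Bus log lines, invented purely to exercise the helper.
dbus_log = [
    "366: (hypothetical) unrelated message",
    "366: (hypothetical) sent ea94265b9365f7158f1d81ee61b20dab on fd 8",
]

token = token_after_ok(strace_line)
print(token)
print(lines_mentioning(token, dbus_log))
```

With real data, `dbus_log` would be the lines of the verbose D-Bus log captured during the same boot as the strace.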





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-16 Thread Abhiseck Paira via Bug reports for GNU Guix

Hello,

I am getting the same errors others have mentioned. I can't login via
tty and GDM wouldn't launch either. This happened after I pulled this
commit 73eb941b8b3793aec6110a4ae28bdbfc3ab4f6c5 with guix pull and ran
system reconfigure.

-- 
Abhiseck Paira
E34E 825B 979E EB9F 8505  F80E E93D 353B 7740 0709
https://abhiseck.neocities.org




bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-15 Thread Michael Rohleder
Hi Maxim,

I currently have this (very annoying) issue on _one_ of my machines (two
others work with nearly the same config).
I can't log in at all: not via the console, SSH, or SDDM.
I spent some time trying to reproduce it in a VM, but without success so far.

These are the relevant messages from syslog:
Dec 15 18:15:52 micha dbus-daemon[470]: [system] Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
Dec 15 18:16:47 micha dbus-daemon[470]: [system] Activating service name='org.freedesktop.login1' requested by ':1.8' (uid=0 pid=899 comm="/gnu/store/ximad0zvg12r4x0x80mvym8hzg0n33jl-shadow") (using servicehelper)
Dec 15 18:16:47 micha elogind[935]: elogind is already running as PID 558
Dec 15 18:17:12 micha dbus-daemon[470]: [system] Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
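
To compare the timeouts reported in syslog excerpts like the one above, something along these lines could be used. It is a minimal sketch, assuming the `NAME_timeout=Nms` notation seen in these dbus-daemon messages; the function name is hypothetical.

```python
import re

# Matches e.g. "service_start_timeout=25000ms" and captures the name and the value.
TIMEOUT_RE = re.compile(r"(?P<name>\w+_timeout)=(?P<ms>\d+)ms")


def timeouts_in(lines):
    """Return a list of (name, seconds) for every timeout setting found in the lines."""
    found = []
    for line in lines:
        for match in TIMEOUT_RE.finditer(line):
            found.append((match.group("name"), int(match.group("ms")) / 1000))
    return found


sample = [
    "Dec 15 18:15:52 micha dbus-daemon[470]: [system] Failed to activate service "
    "'org.freedesktop.login1': timed out (service_start_timeout=25000ms)",
    "Dec 15 18:16:47 micha elogind[935]: elogind is already running as PID 558",
]

print(timeouts_in(sample))  # [('service_start_timeout', 25.0)]
```

Converting to seconds makes it easier to compare the activation timeout with the boot timings seen on slow disks.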





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-10 Thread Ludovic Courtès
Hi Maxim!

Maxim Cournoyer  skribis:

[...]

> Oh!  Marius found this interesting issue [0] that they shared on IRC
> today; I wonder if it may be related.  sd-bus apparently pipelines things
> aggressively, which is not always handled well by other D-bus
> participants.
>
> [0]  https://github.com/systemd/systemd/issues/16610

Interesting.

>> Anyway, the “Hello” message is sent to the system bus asynchronously in
>> ‘sd-bus.c’:

[...]

>> static int hello_callback(sd_bus_message *reply, void *userdata, sd_bus_error *error) {
>>
>> [...]
>>
>> fail:
>>         /* When Hello() failed, let's propagate this in two ways: first we return the error immediately here,
>>          * which is the propagated up towards the event loop. Let's also invalidate the connection, so that
>>          * if the user then calls back into us again we won't wait any longer. */
>>
>>         bus_set_state(bus, BUS_CLOSING);
>>         return r;
>> }
>>
>>
>> It’s not clear from that whether the authors intended for the thing to
>> keep going in case of failure.  In our case it’s not helpful.
>
> If we picture this in the systemd use case, I believe sd-bus must be
> used as *the* d-bus daemon, perhaps?  So it should never die, and expect
> users to call back into it to retry things?  In our case, it acts as a
> D-Bus client, not a server (IIUC), so perhaps its behavior is not tuned
> right for our use case.

I think systemd-logind is a separate process, just like elogind, that
talks over D-Bus; I don’t think there’s any difference here.  It just
seems that this corner case, where we don’t get a reply for ‘Hello’, is
not correctly handled.

> Interesting that you mention this; that's what I worked on yesterday
> (see attached patches).  I managed to get elogind panicking the kernel
> during a guix system reconfigure, which corrupted GRUB, so had to chroot
> and reconfigure from there [1].  Not sure what happened, but it seems
> that killing and restarting elogind is susceptible to cause hard locks.

Ouch!  This is weird.  Note that “everything” depends on elogind, as can
be seen with ‘guix system shepherd-graph’.  So “herd stop elogind” is
not something you should try at home.

Incidentally, can you reproduce the bug in a VM?  That would be much
nicer.

>> We know that elogind is started after dbus-daemon has written its PID
>> file (there’s a Shepherd service dependency).  Is there a possibility
>> that dbus-daemon writes its PID file before it has set
>> ‘new_connection_function’?
>
> Are PID files conventionally agreed to be synchronization primitives?

Yes.  Daemons are supposed to write their PID file once they’re ready to
do their job.

> I'm sorry for my lack of deep analysis here, I've spent most of my
> evening attempting to fix my system just to boot ^^'.  I've at least
> managed to collect the following verbose D-Bus log (54 MiB uncompressed)
> which hopefully can shed some light onto how this failure came to be.

Excerpt:

--8<---cut here---start->8---
366: 0x7f28f396e740: 1639108961.938358 
[dbus-sysdeps-util-unix.c(237):_dbus_write_pid_to_file_and_pipe] writing pid 
file /var/run/dbus/pid
366: 0x7f28f396e740: 1639108961.938438 
[dbus-sysdeps-util-unix.c(291):_dbus_write_pid_to_file_and_pipe] No pid pipe to 
write to
366: 0x7f28f396e740: 1639108961.938474 
[dbus-userdb.c(177):_dbus_user_database_lookup] Using cache for UID 986 
information
366: 0x7f28f396e740: 1639108961.938665 
[dbus-sysdeps-unix.c(3514):_dbus_socketpair] full-duplex pipe 6 <-> 7
366: 0x7f28f396e740: 1639108961.938700 [main.c(719):main] We are on D-Bus...
366: 0x7f28f396e740: 1639108961.938725 [dbus-mainloop.c(884):_dbus_loop_run] 
Running main loop, depth 0 -> 1
366: 0x7f28f396e740: 1639108962.566557 
[dbus-server-socket.c(182):socket_handle_watch] Handling client connection, 
flags 0x1
366: 0x7f28f396e740: 1639108962.566623 [dbus-sysdeps-unix.c(2344):_dbus_accept] 
client fd 8 accepted
366: 0x7f28f396e740: 1639108962.55 
[dbus-server-socket.c(94):handle_new_client_fd_and_unlock] Creating new client 
connection with fd 8
366: 0x7f28f396e740: 1639108962.566779 
[dbus-connection.c(1360):_dbus_connection_new_for_transport] LOCK
366: 0x7f28f396e740: 1639108962.566824 
[dbus-transport-socket.c(180):check_read_watch] fd = 8
366: 0x7f28f396e740: 1639108962.566869 
[dbus-transport-socket.c(227):check_read_watch]   setting read watch enabled = 1

[...]

366: 0x7f28f396e740: 1639108962.568765 
[dbus-transport-socket.c(974):socket_handle_watch] handling read watch 
0x23c2b00 flags = 9
366: 0x7f28f396e740: 1639108962.568807 
[dbus-transport-socket.c(348):exchange_credentials] exchange_credentials: 
do_reading = 1, do_writing = 0
366: 0x7f28f396e740: 1639108962.568862 
[dbus-transport-socket.c(383):exchange_credentials] Failed to read credentials 
Failed to read credentials byte (zero-length read)
366: 0x7f28f396e740: 1639108962.568904 
[dbus-transport.c(510):_dbus_transport_disconnect] start
366: 0x7f28f396e7

bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-09 Thread Ludovic Courtès
Hello!

Maxim Cournoyer  skribis:

> 374   connect(11, {sa_family=AF_UNIX, 
> sun_path="/var/run/dbus/system_bus_socket"}, 34) = 0

[...]

> 374   epoll_wait(5, [{events=EPOLLIN|EPOLLOUT|EPOLLHUP, data={u32=24802800, 
> u64=24802800}}], 20, -1) = 1
> 374   sendmsg(11, {msg_name=NULL, msg_namelen=0, 
> msg_iov=[{iov_base="l\1\0\1\0\0\0\0\1\0\0\0m\0\0\0\1\1o\0\25\0\0\0/org/freedesktop/DBus\0\0\0\3\1s\0\5\0\0\0Hello\0\0\0\2\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0\6\1s\0\24\0\0\0org.freedesktop.DBus\0\0\0\0",
>  iov_len=128}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 
> MSG_DONTWAIT|MSG_NOSIGNAL) = -1 EPIPE (Broken pipe)
> 374   gettid()  = 374
> 374   epoll_ctl(5, EPOLL_CTL_MOD, 11, {events=0, data={u32=24802800, 
> u64=24802800}}) = 0
> 374   timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, 
> tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, NULL) = 0
> 374   epoll_wait(5, [{events=EPOLLHUP, data={u32=24802800, u64=24802800}}, 
> {events=EPOLLIN, data={u32=24764384, u64=24764384}}], 20, -1) = 2
> 374   read(12, "\1\0\0\0\0\0\0\0", 8)   = 8
> 374   gettid()  = 374
> 374   timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, 
> tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, NULL) = 0
> 374   epoll_wait(5, [{events=EPOLLHUP, data={u32=24802800, u64=24802800}}, 
> {events=EPOLLIN, data={u32=24764384, u64=24764384}}], 20, -1) = 2
> 374   read(12, "\1\0\0\0\0\0\0\0", 8)   = 8
> 374   gettid()  = 374
> 374   timerfd_settime(12, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, 
> tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, NULL) = 0
> 374   epoll_wait(5, [{events=EPOLLHUP, data={u32=24802800, u64=24802800}}, 
> {events=EPOLLIN, data={u32=24764384, u64=24764384}}], 20, -1) = 2
> 374   read(12, "\1\0\0\0\0\0\0\0", 8)   = 8
> 374   epoll_ctl(5, EPOLL_CTL_DEL, 11, NULL) = 0
> 374   close(11) = 0
> 374   gettid()  = 374
> 374   epoll_wait(5,  
> 391   <... close resumed>)  = 0
> 391   madvise(0x7fd6c83dc000, 8368128, MADV_DONTNEED) = 0
> 391   exit(0)   = ?
> 391   +++ exited with 0 +++
> 374   <... epoll_wait resumed>[{events=EPOLLERR, data={u32=24768000, 
> u64=24768000}}], 17, -1) = 1
> 374   lseek(7, 0, SEEK_SET) = 0
> 374   read(7, "tty7\n", 63) = 5

As you pointed out on IRC, the initial ‘Hello’ method call above leads
to EPIPE, and we can see that elogind eventually closes its socket to
dbus-daemon *but* keeps doing its thing.

Some interesting things to note…

First, to my surprise, elogind does not use the client library of the
‘dbus’ package:

--8<---cut here---start->8---
$ guix gc --references $(./pre-inst-env guix build elogind)|grep dbus
$ echo $?
1
--8<---cut here---end--->8---

(This is already the case in ‘master’ with v243.7.)  Instead, it has its
own implementation of the DBus protocol, in C, from systemd—we can’t
have enough sources of bugs and vulnerabilities.

Anyway, the “Hello” message is sent to the system bus asynchronously in
‘sd-bus.c’:

--8<---cut here---start->8---
static int bus_send_hello(sd_bus *bus) {
        _cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL;
        int r;

        assert(bus);

        if (!bus->bus_client)
                return 0;

        r = sd_bus_message_new_method_call(
                        bus,
                        &m,
                        "org.freedesktop.DBus",
                        "/org/freedesktop/DBus",
                        "org.freedesktop.DBus",
                        "Hello");
        if (r < 0)
                return r;

        return sd_bus_call_async(bus, NULL, m, hello_callback, NULL, 0);
}
--8<---cut here---end--->8---

A callback is called when a reply is received or an error arises:

--8<---cut here---start->8---
static int hello_callback(sd_bus_message *reply, void *userdata, sd_bus_error *error) {

[...]

fail:
        /* When Hello() failed, let's propagate this in two ways: first we return the error immediately here,
         * which is the propagated up towards the event loop. Let's also invalidate the connection, so that
         * if the user then calls back into us again we won't wait any longer. */

        bus_set_state(bus, BUS_CLOSING);
        return r;
}
--8<---cut here---end--->8---

It’s not clear from that whether the authors intended for the thing to
keep going in case of failure.  In our case it’s not helpful.

But why does dbus-daemon drop the connection in the first place?

To know that, we could change ‘dbus-root-service-type’ to run
dbus-daemon from a ‘--enable-verbose-mode’ build, and with the
‘DBUS_VERBOSE’ environment variable set to 1.

Looking at ‘dbus-server-socket.c’ it would seem that t

bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-08 Thread Ludovic Courtès
Hi,

Maxim Cournoyer  skribis:

> Dec  7 15:55:02 localhost shepherd[1]: Service dbus-system has been started.
> [...]
> Dec  7 15:55:10 localhost shepherd[1]: Service ntpd has been started.
> [...]
> Dec  7 15:54:46 localhost ntpd[341]: ntpd 4.2.8p15@1.3728-o Thu Jan  1 
> 12:00:01 AM UTC 1970 (1): Starting
> [...]
> Dec  7 15:55:14 localhost shepherd[1]: Service elogind has been started.
> [...]
> Dec  7 15:55:21 localhost shepherd[1]: Service upower-daemon has been started.
> [...]
> Dec  7 15:54:51 localhost dbus-daemon[335]: [system] Activating service 
> name='org.freedesktop.login1' requested by ':1.1' (uid=0 pid=345 
> comm="/gnu/store/g1qlpzcfnk2r6186al2hfqjmq9yl7qkk-upower") (using 
> servicehelper)

The key thing here is that dbus-daemon thinks elogind is not running and
thus considers it has to start it, which is bound to fail.

You can display the list of services known to the DBus system bus with:

  dbus-send --system --print-reply --dest=org.freedesktop.DBus \
/org/freedesktop/DBus org.freedesktop.DBus.ListNames

I tried in a VM that had booted fine and it shows
“org.freedesktop.login1” as expected.  Could you check what that gives
for you when the bug above shows up?

I wonder if it’s possible for there to be a race, like elogind says
“hi!” to dbus-daemon but dbus-daemon is still sleepy and doesn’t notice.
Seems hard to believe though.

If it’s reproducible, could you instrument ‘elogind-service-type’ such
that elogind is started with “strace -f -o /elogind.log -s 700”?

HTH,
Ludo’.





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-07 Thread Maxim Cournoyer
Hello,

Josselin Poiret  writes:

> Maxim Cournoyer  writes:
>
>> Hello,
>>
>> I've found a workaround: restarting elogind via SSH resolved the issue.
>>
>> I guess it may be a race between elogind and dbus-system (elogind gets
>> started before dbus-system is fully up, and the communication with the
>> session bus is somehow crippled from there?).
>
> Does this happen with other login managers on your system, like LightDM?
> The thing is that elogind depends on dbus-system, so I'm not sure there
> should be a race there.

I have the same problem using GDM on that machine :-(.

I'll try poking the dbus session manually using dbus-send, or the Scheme
API for it under (guix build jami-service).

Thanks!

Maxim





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-07 Thread Maxim Cournoyer
Hi again,

Maxim Cournoyer  writes:

[...]

> I noticed that dbus system session times out on every boot on that
> machine.  I also notice that when the NTPD daemon starts, it rewinds
> time by about 25 s on every boot

Not NTP related; the same occurs when removing the ntp-service-type from
%desktop-services.

Maxim





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-12-07 Thread Maxim Cournoyer
Hello!

Josselin Poiret  writes:

> Maxim Cournoyer  writes:
>
>> Hello,
>>
>> I've found a workaround: restarting elogind via SSH resolved the issue.
>>
>> I guess it may be a race between elogind and dbus-system (elogind gets
>> started before dbus-system is fully up, and the communication with the
>> session bus is somehow crippled from there?).
>
> Does this happen with other login managers on your system, like LightDM?
> The thing is that elogind depends on dbus-system, so I'm not sure there
> should be a race there.

I've yet to try this, but it *doesn't* happen on another machine using
an almost-identical configuration (slim + ratpoison).

> I noticed though that none of our desktop managers' shepherd services
> require elogind, and in the case of SLiM not even dbus-system.  Maybe we
> should add it there, since we want Shepherd to handle launching elogind,
> and avoid dbus launching one by itself if the login1 service is used or
> even by the PAM elogind module.  Can you try adding that to
> slim-shepherd-service?

I've tried adding elogind in the requirements of the slim service type
as well as the upower-daemon service type, but it didn't help.

Based on my observations (/var/log/messages), it seems that the dbus
session bus is not yet ready when elogind starts, and thus the various
dbus services are not yet published/registered when other dependents
need them.

I noticed that the dbus system session times out on every boot on that
machine.  I also noticed that when the NTPD daemon starts, it rewinds
time by about 25 s on every boot:

--8<---cut here---start->8---
[...]
Dec  7 15:55:02 localhost shepherd[1]: Service dbus-system has been started.
[...]
Dec  7 15:55:10 localhost shepherd[1]: Service ntpd has been started.
[...]
Dec  7 15:54:46 localhost ntpd[341]: ntpd 4.2.8p15@1.3728-o Thu Jan  1 12:00:01 
AM UTC 1970 (1): Starting
[...]
Dec  7 15:55:14 localhost shepherd[1]: Service elogind has been started.
[...]
Dec  7 15:55:21 localhost shepherd[1]: Service upower-daemon has been started.
[...]
Dec  7 15:54:51 localhost dbus-daemon[335]: [system] Activating service 
name='org.freedesktop.login1' requested by ':1.1' (uid=0 pid=345 
comm="/gnu/store/g1qlpzcfnk2r6186al2hfqjmq9yl7qkk-upower") (using servicehelper)
[...]
Dec  7 15:55:02 localhost elogind[352]: elogind is already running as PID 343
[...]
Dec  7 15:55:16 localhost dbus-daemon[335]: [system] Failed to activate service 
'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
[...]
Dec  7 15:55:24 localhost dbus-daemon[335]: [system] Connection has not 
authenticated soon enough, closing it (auth_timeout=3ms, elapsed: 36684ms)
--8<---cut here---end--->8---


> Also, maybe patching the dbus service to Exec=false instead would be a
> good safeguard against dbus launching elogind itself.

Perhaps a good idea to try also!

Attached is the /var/log/messages of the problematic boot, nearly
unedited.

Dec  7 15:54:11 localhost syslogd (GNU inetutils 2.0): restart
Dec  7 15:54:11 localhost vmunix: [0.00] Linux version 5.15.6-gnu 
(guix@guix) (gcc (GCC) 10.3.0, GNU ld (GNU Binutils) 2.37) #1 SMP 1
Dec  7 15:54:11 localhost vmunix: [0.00] Command line: 
BOOT_IMAGE=/@root/gnu/store/xz7ybgzdr0kq8wx9czbfipxwsg6i5ydw-linux-libre-5.15.6/bzImage
 --root=/dev/mapper/cryptroot 
--system=/gnu/store/mnqz2jb567kiflbh6afhn1zsv9dyz9j2-system 
--load=/gnu/store/mnqz2jb567kiflbh6afhn1zsv9dyz9j2-system/boot quiet 
snd_hda_intel.dmic_detect=0 modprobe.blacklist=rtl8187
Dec  7 15:54:11 localhost vmunix: [0.00] KERNEL supported cpus:
Dec  7 15:54:12 localhost vmunix: [0.00]   Intel GenuineIntel
Dec  7 15:54:12 localhost vmunix: [0.00]   AMD AuthenticAMD
Dec  7 15:54:12 localhost vmunix: [0.00]   Hygon HygonGenuine
Dec  7 15:54:12 localhost vmunix: [0.00]   Centaur CentaurHauls
Dec  7 15:54:12 localhost vmunix: [0.00]   zhaoxin   Shanghai  
Dec  7 15:54:12 localhost vmunix: [0.00] x86/fpu: x87 FPU will use 
FXSAVE
Dec  7 15:54:13 localhost vmunix: [0.00] signal: max sigframe size: 1440
Dec  7 15:54:13 localhost vmunix: [0.00] BIOS-provided physical RAM map:
Dec  7 15:54:13 localhost vmunix: [0.00] BIOS-e820: [mem 
0x-0x0009fbff] usable
Dec  7 15:54:13 localhost vmunix: [0.00] BIOS-e820: [mem 
0x0009fc00-0x0009] reserved
Dec  7 15:54:13 localhost vmunix: [0.00] BIOS-e820: [mem 
0x000e4000-0x000f] reserved
Dec  7 15:54:13 localhost vmunix: [0.00] BIOS-e820: [mem 
0x0010-0xbff7] usable
Dec  7 15:54:14 localhost vmunix: [0.00] BIOS-e820: [mem 
0xbff8-0xbff8dfff] ACPI data
Dec  7 15:54:14 localhost vmunix: [0.00] BIOS-e820: [mem 
0xbff8e000-0xbffd] ACPI NVS
Dec  7 15:54:14 localhost vmunix: [0.00] BIOS-e820: [mem 
0x00

bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-11-26 Thread Josselin Poiret via Bug reports for GNU Guix
Maxim Cournoyer  writes:

> Hello,
>
> I've found a workaround: restarting elogind via SSH resolved the issue.
>
> I guess it may be a race between elogind and dbus-system (elogind gets
> started before dbus-system is fully up, and the communication with the
> session bus is somehow crippled from there?).

Does this happen with other login managers on your system, like LightDM?
The thing is that elogind depends on dbus-system, so I'm not sure there
should be a race there.

I noticed though that none of our desktop managers' shepherd services
require elogind, and in the case of SLiM not even dbus-system.  Maybe we
should add it there, since we want Shepherd to handle launching elogind,
and avoid dbus launching one by itself if the login1 service is used or
even by the PAM elogind module.  Can you try adding that to
slim-shepherd-service?

Also, maybe patching the dbus service to Exec=false instead would be a
good safeguard against dbus launching elogind itself.

Tangentially: SLiM is very old and unmaintained, so it may one day
simply break with newer versions of systemd/elogind.

Best,
Josselin Poiret





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-11-25 Thread Ludovic Courtès
Hi,

Maxim Cournoyer  skribis:

> Nov 24 21:23:58 localhost dbus-daemon[341]: [system] Activating service 
> name='org.freedesktop.login1' requested by ':1.16' (uid=0 pid=324 
> comm="/gnu/store/ximad0zvg12r4x0x80mvym8hzg0n33jl-shadow") (using 
> servicehelper)
> Nov 24 21:23:58 localhost elogind[1114]: elogind is already running as PID 355
> Nov 24 21:24:11 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-BEACON-LOSS 
> Nov 24 21:24:21 localhost last message repeated 5 times
> Nov 24 21:24:23 localhost dbus-daemon[341]: [system] Failed to activate 
> service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
> Nov 24 21:24:23 localhost shepherd[1]: Respawning term-tty2. 
> Nov 24 21:24:23 localhost shepherd[1]: Service host-name has been started. 
> Nov 24 21:24:23 localhost shepherd[1]: Service term-tty2 has been started. 
> Nov 24 21:24:23 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-BEACON-LOSS 
> Nov 24 21:24:27 localhost last message repeated 3 times
> Nov 24 21:26:04 localhost dbus-daemon[341]: [system] Activating service 
> name='org.freedesktop.login1' requested by ':1.17' (uid=0 pid=429 
> comm="/gnu/store/nvvmksc9pvahqmypaz3h8mqya82vnga8-slim-1") (using 
> servicehelper)
> Nov 24 21:26:04 localhost elogind[1127]: elogind is already running as PID 355
> Nov 24 21:26:29 localhost dbus-daemon[341]: [system] Failed to activate 
> service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)

It looks like elogind thinks it’s already running as PID 355, where the
PID comes from its PID file; there’s indeed a process with this PID, but
is it really elogind?  Or could it be that there’s a stale PID file?

Does /var/log/messages show the very first time elogind is started at
boot time?  With which PID?
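
One way to check for a stale PID file is to compare the PID it records
with the command name the kernel reports for that process.  Below is a
minimal Guile sketch, assuming a Linux /proc file system; the procedure
names are made up for this example, and the PID file path used in the
note after the code is an assumption, not something confirmed in this
thread.

```scheme
(use-modules (ice-9 rdelim))

(define (read-first-line file)
  "Return the first line of FILE as a string, or #f if FILE cannot be
read or is empty."
  (catch 'system-error
    (lambda ()
      (let ((line (call-with-input-file file read-line)))
        (and (string? line) line)))
    (lambda _ #f)))

(define (pid-file-owner pid-file)
  "Return the command name of the process whose PID is recorded in
PID-FILE, or #f if the file is missing, malformed, or names a process
that no longer exists."
  (let* ((line (read-first-line pid-file))
         (pid  (and line (string->number (string-trim-both line)))))
    (and pid
         (integer? pid)
         (positive? pid)
         ;; /proc/PID/comm holds the short command name of the process.
         (read-first-line (format #f "/proc/~a/comm" pid)))))
```

If (pid-file-owner "/run/elogind.pid") returned #f or a name other than
"elogind", that would suggest the PID file is stale.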

We need to find out why dbus-daemon thinks it needs to restart elogind.

Ludo’.





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-11-24 Thread Maxim Cournoyer
Hello,

I've found a workaround: restarting elogind via SSH resolved the issue.

I guess it may be a race between elogind and dbus-system (elogind gets
started before dbus-system is fully up, and the communication with the
session bus is somehow crippled from there?).

I'll experiment with the elogind service a bit, adding a dumb sleep to
its start slot to see if it prevents the race.

Thank you,

Maxim





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-11-24 Thread Maxim Cournoyer
Hello Josselin,

Josselin Poiret  writes:

> Hello Maxim,
>
> Maxim Cournoyer  writes:
>
>> --8<---cut here---start->8---
>> Nov 23 01:09:14 localhost dbus-daemon[383]: [system] Activating
>> service name='org.freedesktop.login1' requested by ':1.17' (uid=0
>> pid=370
>> comm="/gnu/store/ximad0zvg12r4x0x80mvym8hzg0n33jl-shadow") (using 
>> servicehelper)
>> Nov 23 01:09:14 localhost elogind[1189]: elogind is already running as PID 
>> 390
>> Nov 23 01:09:20 localhost shepherd[1]: Respawning term-tty1. 
>> Nov 23 01:09:20 localhost shepherd[1]: Service host-name has been started. 
>> Nov 23 01:09:20 localhost shepherd[1]: Service term-tty1 has been started.
>> Nov 23 01:09:39 localhost dbus-daemon[383]: [system] Failed to
>> activate service 'org.freedesktop.login1': timed out
>> (service_start_timeout=25000ms)
>> --8<---cut here---end--->8---
>>
>> I don't remember if I saw the slim login screen; but in any case I
>> couldn't successfully login even via a ptty.
>>
>> It may have to do with polkit.
>>
>> To be investigated.
>>
>> This happened on a system *not* using gdm (it uses slim) and with
>> ratpoison as the WM, on commit f42bc604547d9ee8e35fcd66d5db7786954cfac3
>> of the core-updates-frozen branch.
>>
>> To be investigated.
>
> I cannot reproduce in a fresh VM on commit
> d5de4e163ccef80f78bc5fe330f568d8fe3a23ab, and can login just fine, with
>
>   (services (cons* (service slim-service-type (slim-configuration))
>                    (modify-services %desktop-services
>                      (delete gdm-service-type))))
>
> Is this still affecting you?

Yes!  It didn't occur in a 'guix system vm my-config.scm', but the exact
same config deployed on my machine fails at login.

Some symptoms:

1. Slim login screen comes up, but after entering credentials Xorg
resets (back to login screen)

2. Going to a TTY and attempting to log in there fails with a "Login
failed after 60 s timeout" or similar error.

3. I can log in via SSH (thank goodness!)

4. There are no errors (EE) in /var/log/Xorg.0.log

5. Here's the tail of my /var/log/messages:

--8<---cut here---start->8---
Nov 24 21:23:54 localhost ntpd[346]: Soliciting pool server 216.197.156.83
Nov 24 21:23:55 localhost ntpd[346]: Soliciting pool server 206.108.0.133
Nov 24 21:23:56 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-BEACON-LOSS 
Nov 24 21:23:56 localhost ntpd[346]: Soliciting pool server 98.143.85.249
Nov 24 21:23:57 localhost ntpd[346]: Soliciting pool server 192.95.27.155
Nov 24 21:23:58 localhost dbus-daemon[341]: [system] Activating service 
name='org.freedesktop.login1' requested by ':1.16' (uid=0 pid=324 
comm="/gnu/store/ximad0zvg12r4x0x80mvym8hzg0n33jl-shadow") (using servicehelper)
Nov 24 21:23:58 localhost elogind[1114]: elogind is already running as PID 355
Nov 24 21:24:11 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-BEACON-LOSS 
Nov 24 21:24:21 localhost last message repeated 5 times
Nov 24 21:24:23 localhost dbus-daemon[341]: [system] Failed to activate service 
'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
Nov 24 21:24:23 localhost shepherd[1]: Respawning term-tty2. 
Nov 24 21:24:23 localhost shepherd[1]: Service host-name has been started. 
Nov 24 21:24:23 localhost shepherd[1]: Service term-tty2 has been started. 
Nov 24 21:24:23 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-BEACON-LOSS 
Nov 24 21:24:27 localhost last message repeated 3 times
Nov 24 21:26:04 localhost dbus-daemon[341]: [system] Activating service 
name='org.freedesktop.login1' requested by ':1.17' (uid=0 pid=429 
comm="/gnu/store/nvvmksc9pvahqmypaz3h8mqya82vnga8-slim-1") (using servicehelper)
Nov 24 21:26:04 localhost elogind[1127]: elogind is already running as PID 355
Nov 24 21:26:29 localhost dbus-daemon[341]: [system] Failed to activate service 
'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
Nov 24 21:26:29 localhost shepherd[1]: Respawning xorg-server. 
Nov 24 21:26:29 localhost shepherd[1]: Service host-name has been started. 
Nov 24 21:26:29 localhost shepherd[1]: Service xorg-server has been started. 
Nov 24 21:27:23 localhost ntpd[346]: Soliciting pool server 209.115.181.108
Nov 24 21:27:24 localhost ntpd[346]: Soliciting pool server 138.197.153.200
Nov 24 21:27:25 localhost ntpd[346]: Soliciting pool server 162.159.200.123
Nov 24 21:27:26 localhost ntpd[346]: Soliciting pool server 162.159.200.1
Nov 24 21:29:09 localhost ntpd[346]: kernel reports TIME_ERROR: 0x41: Clock 
Unsynchronized
Nov 24 21:29:42 localhost ntpd[346]: Soliciting pool server 199.182.221.110
Nov 24 21:35:23 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-SIGNAL-CHANGE 
above=0 signal=-87 noise=-95 txrate=27
Nov 24 21:35:32 localhost wpa_supplicant[343]: wlp4s0: CTRL-EVENT-SIGNAL-CHANGE 
above=1 signal=-67 noise=-95 txrate=27
Nov 24 21:42:50 localhost dbus-daemon[341]: [system] Activating service 
name='o

bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-11-24 Thread Josselin Poiret via Bug reports for GNU Guix
Hello Maxim,

Maxim Cournoyer  writes:

> --8<---cut here---start->8---
> Nov 23 01:09:14 localhost dbus-daemon[383]: [system] Activating service 
> name='org.freedesktop.login1' requested by ':1.17' (uid=0 pid=370
> comm="/gnu/store/ximad0zvg12r4x0x80mvym8hzg0n33jl-shadow") (using 
> servicehelper)
> Nov 23 01:09:14 localhost elogind[1189]: elogind is already running as PID 390
> Nov 23 01:09:20 localhost shepherd[1]: Respawning term-tty1. 
> Nov 23 01:09:20 localhost shepherd[1]: Service host-name has been started. 
> Nov 23 01:09:20 localhost shepherd[1]: Service term-tty1 has been started.
> Nov 23 01:09:39 localhost dbus-daemon[383]: [system] Failed to activate 
> service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
> --8<---cut here---end--->8---
>
> I don't remember if I saw the slim login screen; but in any case I
> couldn't successfully login even via a ptty.
>
> It may have to do with polkit.
>
> To be investigated.
>
> This happened on a system *not* using gdm (it uses slim) and with
> ratpoison as the WM, on commit f42bc604547d9ee8e35fcd66d5db7786954cfac3
> of the core-updates-frozen branch.
>
> To be investigated.

I cannot reproduce in a fresh VM on commit
d5de4e163ccef80f78bc5fe330f568d8fe3a23ab, and can login just fine, with

  (services (cons* (service slim-service-type (slim-configuration))
                   (modify-services %desktop-services
                     (delete gdm-service-type))))

Is this still affecting you?

Best,
Josselin Poiret





bug#52051: [core-updates-frozen] cannot login ('org.freedesktop.login1' service times out)

2021-11-22 Thread Maxim Cournoyer
Hello,

I'm not 100% sure this is the cause, but these are the last messages I
have before I rebooted:

--8<---cut here---start->8---
Nov 23 01:09:14 localhost dbus-daemon[383]: [system] Activating service 
name='org.freedesktop.login1' requested by ':1.17' (uid=0 pid=370
comm="/gnu/store/ximad0zvg12r4x0x80mvym8hzg0n33jl-shadow") (using 
servicehelper)
Nov 23 01:09:14 localhost elogind[1189]: elogind is already running as PID 390
Nov 23 01:09:20 localhost shepherd[1]: Respawning term-tty1. 
Nov 23 01:09:20 localhost shepherd[1]: Service host-name has been started. 
Nov 23 01:09:20 localhost shepherd[1]: Service term-tty1 has been started.
Nov 23 01:09:39 localhost dbus-daemon[383]: [system] Failed to activate service 
'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
--8<---cut here---end--->8---

I don't remember if I saw the slim login screen; but in any case I
couldn't successfully login even via a ptty.

It may have to do with polkit.

To be investigated.

This happened on a system *not* using gdm (it uses slim) and with
ratpoison as the WM, on commit f42bc604547d9ee8e35fcd66d5db7786954cfac3
of the core-updates-frozen branch.

To be investigated.

Thanks,

Maxim