On Wed, 30.05.12 22:36, Marti Raudsepp (ma...@juffo.org) wrote:

Heya,

> On Tue, May 29, 2012 at 11:29 PM, Lennart Poettering
> <lenn...@poettering.net> wrote:
> > The journal is still very new. I think so far it is quite
> > stable, but there is definitely more work necessary to make it rock
> > solid in all corner cases.
>
> Well, any concrete ideas? Locking the user out of his/her own system
> is the best way to become hated by sysadmins. I certainly don't want
> to see journald on my servers until that's addressed.
>
> Maybe a client-side timeout in libsystemd-journal? While that could
> still effectively crash applications by slowing them down too much, at
> least it's possible to log in to inspect and fix the issue.

Well, this is hard to fix and inherent to the syslog client side,
i.e. glibc's syslog() call. It's synchronous, and hence SIGSTOPing your
syslog daemon of choice will eventually cause the whole system to
freeze. Many people have configured their classic syslog daemon to
output logs on /dev/tty12. If you press C-s there (or accidentally hit
Scroll Lock) you end up freezing syslog too, and thus freezing the
entire machine sooner or later. It's kind of a known problem.

Not sure what we could do about this. Sure, it is easy to fix the
journal client libraries and have a timeout in there, but only a very
small number of clients use the native libraries; most go via glibc's
syslog() call, and that is implemented synchronously.

Fixing glibc would definitely be a good idea though. Adding the timeout
change there (which would actually be dead-easy, simply by using
SO_SNDTIMEO) would not really fix the problem too well though: given
the amount of messages that are generated, the system might not be
locked up entirely but would still be very slow.

I think avoiding problems like this is mostly a matter of giving it a
bit of robustness love, and testing.
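
A minimal sketch of what a send timeout via SO_SNDTIMEO does when the
receiving side never reads, written in Python rather than as a glibc
patch. It is an illustration under assumptions, not how glibc's syslog()
is implemented: the helper names (set_send_timeout, log_until_stalled),
the 0.2 s timeout and the demo message are all made up, a socketpair
stands in for /dev/log and a stopped daemon, and the timeval packing
assumes 64-bit Linux.

```python
import socket
import struct
import time


def set_send_timeout(sock, seconds):
    """Set SO_SNDTIMEO so a blocking send() gives up after `seconds`."""
    sec = int(seconds)
    usec = int((seconds - sec) * 1_000_000)
    # Assumption: struct timeval is two native longs (true on 64-bit Linux).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO,
                    struct.pack("ll", sec, usec))


def log_until_stalled(timeout=0.2, max_messages=100_000):
    """Send datagrams to a peer that never reads until send() times out.

    Returns (messages_queued, seconds_spent_in_the_final_send).
    """
    # The unread end stands in for a SIGSTOPped syslog daemon.
    client, stopped_daemon = socket.socketpair(socket.AF_UNIX,
                                               socket.SOCK_DGRAM)
    try:
        set_send_timeout(client, timeout)
        sent = 0
        while sent < max_messages:
            started = time.monotonic()
            try:
                client.send(b"<14>demo: hello")
            except BlockingIOError:
                # The kernel returned EAGAIN: the queue stayed full
                # for the whole timeout, so the message is dropped.
                return sent, time.monotonic() - started
            sent += 1
        # Safety cap reached without ever blocking.
        return sent, 0.0
    finally:
        client.close()
        stopped_daemon.close()


if __name__ == "__main__":
    sent, waited = log_until_stalled()
    print(f"queued {sent} messages, then send() gave up after {waited:.2f}s")
```

Once the queue is full, every further message costs the caller up to the
full timeout before it is dropped, which is why a chatty system would
stay very slow rather than freeze outright.
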
Thankfully we now have Harald's test system logic, and it is definitely
my intention to get this extended so that we can automatically test
systemd in all kinds of memory/disk space pressure problems.

In short: this is probably best fixed by rigorous, automated testing in
low-memory/low-disk space situations, not by adding timeouts which would
ameliorate the situation only minimally.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel