Re: Ton of random units "could not be found"

2023-12-16 Thread Uoti Urpala
On Sat, 2023-12-16 at 15:31 +0100, Lennart Poettering wrote:
> On Fri, 15.12.23 22:17, chandler (s...@riseup.net) wrote:
> >     Other items have different situations, like tmp.mount exists at
> > /usr/share/systemd/tmp.mount but isn't an enabled unit or anything, if I
> > try to enable or unmask it I'm just told "Unit tmp.mount could not be
> > found." or "Unit file tmp.mount does not exist."
> 
> /usr/share/systemd/ is not a directory systemd ever looks into for
> unit files. If Debian packaged something there, this smells like a
> bug. Please report it to your distro.

Debian does not use tmpfs for /tmp by default. The unit is placed there
because it is intentionally not in use unless you explicitly enable it,
and "systemctl enable" alone is not enough. I believe it's done this way
to prevent dependencies in other units from accidentally activating it,
possibly at a moment when it would hide already existing contents of
/tmp.


Re: [systemd-devel] how to let systemd hibernate start/stop the swap area?

2023-03-31 Thread Uoti Urpala
On Sat, 2023-04-01 at 06:16 +1100, Michael Chapman wrote:
> On Fri, 31 Mar 2023, Lennart Poettering wrote:
> [...]
> > Presumably your system mmaps ELF binaries, VM images, and similar
> > stuff into memory. If you don't allow anonymous memory to be backed
> > out onto swap, then you are basically telling the kernel "please page
> > out my program code instead". Which is typically a lot worse.
> 
> Yes, but my point is that it _doesn't matter_ if SSH or journald or 
> whatever is in memory or needs to be paged back in again. It's such a tiny 
> fraction of the system's overall workload.

That contradicts what you said earlier about the system actually
writing a significant amount of data to swap. If, when swap was
enabled, the system wrote a large amount of data to the swap, that
implies there must be a large amount of some other data that it was
able to keep in memory instead. Linux should not write all information
from memory to swap just to leave the memory empty and without any
useful content - everything written to swap should correspond to
something else kept in memory.

So if you say that the swap use was overall harmful for behavior,
claiming that the *size* of other data kept in memory was too small to
matter doesn't really make any sense. If the swap use was significant,
then it should have kept a significant amount of some other data in
memory, either for the main OS or for the guests.



Re: [systemd-devel] Smooth upgrades for socket activated services

2023-02-21 Thread Uoti Urpala
On Mon, 2023-02-20 at 12:22 +0100, Mike Hearn wrote:
> I see. So basically you have to keep the service running across the
> upgrade and then wait for it to shut down due to inactivity, then be
> restarted by systemd to make the update apply. Or alternatively you
> could make the app detect that it's been updated, stop accepting new
> connections, finish servicing the old connections, and then shut
> itself down once all existing connections are finished. On restart
> it'd then be using the new code, re-accept the socket from systemd
> and start accepting again.

Instead of "detect that it's been updated", I believe a more common and
recommendable approach would be to make it part of the daemon's normal
clean shutdown (for daemons where this behavior is appropriate). That
is, stop accepting new connections from the listening socket, but
finish serving already accepted connections. Then the "restart" part
alone is enough to switch to a new version without losing connections
(at least if things don't take so long that connections time out).



Re: [systemd-devel] Systemd hang when restarting a service during shutdown

2021-11-08 Thread Uoti Urpala
On Mon, 2021-11-08 at 12:05 +0100, Sean Nyekjaer wrote:
> Regarding,
> https://github.com/systemd/systemd/issues/21203
> 
> I think the point of the issue was missed when the issue got closed.
> 
> We have a service that is changing configs for systemd-networkd and
> issuing a `systemctl restart systemd-networkd`.
> Another service is checking uptime and issues a `systemctl reboot`
> when our max uptime has been exceeded.
> If a restart of systemd-networkd happens while a reboot is in progress,
> the system will hang "forever" (and continue to pet the watchdog).

The issue shows you using "systemctl start systemd-reboot". That is not
the right way to reboot. Use "systemctl reboot" instead. I suspect this
is related to why the reboot may stop partway: your command does not
start the reboot tasks in "irreversible" mode, which means that any
following contrary command, such as explicitly (re)starting a unit that
was going to be shut down, is going to implicitly cancel the
conflicting reboot action.

You should also be using "try-restart" instead of "restart". If your
intent is to change configs, you want to say "make sure old configs are
not in use" rather than "enforce that the service is running now". (I
think making the "restart" command have "start" semantics was a design
mistake, and the "try-restart"/"restart" pair would have been better
named "restart"/"start-or-restart".)
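
As a rough sketch of the second point, a oneshot unit that applies new
network configuration could call "try-restart" rather than "restart".
The unit name apply-netconf.service and the path to systemctl are
assumptions made for illustration; they are not taken from the issue.

```ini
# apply-netconf.service (hypothetical name)
[Unit]
Description=Apply updated systemd-networkd configuration

[Service]
Type=oneshot
# try-restart restarts networkd only if it is currently running and
# does nothing otherwise, so it should not queue a fresh start job.
# The systemctl path is an assumption; it may differ on your system.
ExecStart=/usr/bin/systemctl try-restart systemd-networkd.service
```

The reboot itself would still be requested with plain "systemctl
reboot", as described above.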


Re: [systemd-devel] Reloading configuration after mount unit

2021-06-18 Thread Uoti Urpala
On Fri, 2021-06-18 at 19:48 +0200, Norbert Lange wrote:
> If systemd assumes the whole /usr drive to be static and has no way to
> dynamically reload and "retarget"
> (adding new wants/requires dependencies to starting/started targets)
> then I guess that's the end of it.

Systemd does not necessarily assume the whole of /usr to be mounted at
once - you could have binaries and data in submounts and systemd could
wait for those - but there is no "normal" way to add units in the
middle of operation. There is "daemon-reload", but it's meant more for
things like online updates, not as part of a standard boot sequence.
Normally the complete set of units to start in a boot should be known
early, not appended to in parts as things are mounted.

So you could have tools in a separately-mounted /usr/local, but I think
you'd need to have the systemd configuration for them in the main /usr
to have things behave nicely.


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd unit transition timestamps

2021-06-10 Thread Uoti Urpala
On Thu, 2021-06-10 at 10:49 +, paul.niel...@fujitsu.com wrote:
> So, anyways, I don't see the difference between the units that causes the
> different behavior. Furthermore, from my point of view (as a user) it
> contradicts the description of the Timestamp values in the man page somehow,
> where it says "recorded on this boot".
> 
> Is this behaviour intended? Or is there another way to read the times a unit
> was stopped, without setting up my own event listener or searching the
> (potentially rotated/vacuumed) journal?

My guess is that after stopping the units, nothing references them any
more, and thus systemd garbage collects the unit data structures.
Systemd does not keep a record of all previously-used but no longer
relevant units in memory. Thus no record remains (outside logs) that
the unit was ever running, and there is no data stored about it. In the
docker.service case there may be another active unit that references
docker.service and thus keeps it loaded in memory.


> Is this behaviour intended? Or is there another way to read the times a unit
> was stopped, without setting up my own event listener or searching the
> (potentially rotated/vacuumed) journal?

I don't believe that the systemd daemon keeps any record about no-
longer-relevant units that were once active in the past, so there is
nothing to read later. So the alternatives are to read logs for past
events, record the event as it happens with a listener, or possibly
create a dummy active service that references the one you care about
and so keeps the data structure inside systemd alive.
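
The last alternative might look roughly like the unit below. This is an
untested sketch: the unit names are hypothetical, and it assumes that
an ordering reference from a loaded, active unit is enough to keep the
referenced unit from being garbage collected.

```ini
# keep-foo-loaded.service (hypothetical name)
[Unit]
Description=Keep foo.service loaded so its timestamps stay readable
# Ordering-only reference: it mentions foo.service without pulling
# it in or stopping together with it.
After=foo.service

[Service]
Type=oneshot
# Stay "active" after the no-op command exits so this unit, and the
# reference it holds, remain loaded.
RemainAfterExit=yes
# The path to "true" is an assumption.
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target
```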



Re: [systemd-devel] Storing package metadata in ELF objects

2021-05-19 Thread Uoti Urpala
On Wed, 2021-05-19 at 02:19 +0200, Guillem Jover wrote:
> So this is where I guess I'm missing something. To be able to make
> sense of the coredumps there are two things that might end up being
> relevant, backtraces and source code. systemd-coredump might already

I understood Luca's point to be about more basic metadata. Not in the
context of a human analyzing coredumps, but having basic information
about *what* crashed in log files.

So when the system gets a coredump, the goal is to have more
information in the logs than just "some process crashed in the
container" and name of the binary. Instead the logs would contain
information about what package/version the binary is from (or at least
claims to be from). And writing log files should not involve loading
any extra data from the network.




Re: [systemd-devel] Antw: [EXT] Re: Still confused with socket activation

2021-02-08 Thread Uoti Urpala
On Mon, 2021-02-08 at 11:27 +0100, Reindl Harald wrote:
> this is *not* what systemd-sockets are for
> they are for service is started at the first connect

This is wrong. Socket units are useful completely independently of
whether the unit is started on demand, and it's a good idea to use them
even for services that are always started on boot. They allow
configuring listening ports in a consistent manner, and make it
possible to avoid direct dependencies between services. The latter
pretty much avoids all further issues with ordering: once you've
started all the sockets, you can freely start all the services in
parallel or in whatever order - a depended-on service process starting
later is never a problem, since requests will just get queued in the
socket and will work fine once the service is fully up. In principle,
you could even have two services which both require the other, as long
as the exact requests they make will not result in a deadlock. In
almost any setup at least the improved parallelism improves performance
at boot or when otherwise starting services.
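
A minimal sketch of such a pair, for a service that is started at boot
and still takes its listening port from a socket unit. The names foo
and food, the port and the binary path are placeholders, not anything
from this thread.

```ini
# foo.socket
[Unit]
Description=Listening socket for foo

[Socket]
# Port number is an arbitrary example.
ListenStream=8080

[Install]
WantedBy=sockets.target
```

```ini
# foo.service
[Unit]
Description=Foo daemon

[Service]
# The daemon has to use the socket passed in by systemd (for example
# via sd_listen_fds()) instead of binding the port itself.
ExecStart=/usr/local/bin/food

[Install]
# Started at boot regardless of incoming connections.
WantedBy=multi-user.target
```

Clients that connect before food is fully up simply wait in the socket
queue, which is what removes the need for ordering between services.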




Re: [systemd-devel] Journald retaining logs for only 10 days

2020-11-14 Thread Uoti Urpala
On Sat, 2020-11-14 at 09:31 +, Nikolaus Rath wrote:
> # journalctl --disk-usage
> Archived and active journals take up 320.0M in the file system.
> 
> # journalctl > alllogs
> # ls -lh alllogs 
> -rw-r--r-- 1 root root 27M Nov 14 09:24 alllogs

The journal stores a lot of metadata for each log entry, so the
"alllogs" size is not a good indicator of disk space requirements.
Use "journalctl -o verbose" to see all information that is actually
stored.

So basically I believe this is (at least mostly) a case of the
configured space not being enough to store more logs, given all the
metadata space requirements.
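
If longer retention is wanted, one option is to raise the space limit
in journald.conf. The value below is an arbitrary example, not a
recommendation for this particular system.

```ini
# /etc/systemd/journald.conf
[Journal]
# Upper bound on disk space used by persistent journal files.
# 1G is an assumed example value.
SystemMaxUse=1G
```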




Re: [systemd-devel] systemd unit timer

2020-08-10 Thread Uoti Urpala
On Mon, 2020-08-10 at 20:19 +0100, Dave Howorth wrote:
> On Mon, 10 Aug 2020 20:21:51 +0200
> Lennart Poettering wrote:
> > i.e. it unifies how system programs are invoked, and that's a good
> > thing. it turns time-based activation into "just another type of
> > activation".
> 
> Most of that has gone over my head so some examples would probably help
> me to understand. Perhaps they're in the git logs?
> 
> But I'm not normally running system code in cronjobs. I usually run
> either scripts I have written myself, or backup commands and the like.

Even many of the simplest scripts will most likely benefit from basic
systemd unit functionality like having correct journal metadata
("something logged from foo.service" as opposed to "something logged
from (child process of) cron.service").


> If I wanted to run a service, presumably I could just write a 'manual'
> invocation as a cron or at job? I'm not seeing the big imperative to
> create another new bunch of code to learn and maintain. I expect I'm
> blind.

You can view "running code at a specified time" as having two parts:
choosing the time when to start it, and the general ability to run code
in a specified environment (user, sandboxing, resource limits,
dependencies, etc). Cron does the first, but systemd units do the
second a lot better. If you want to have support for both, you either
need to add most of the stuff in systemd units to cron, or timer units
to systemd. The second is a much saner option.

Basically, you want to have support for everything in unit files also
for code that is started based on time. This means that having cron
directly running code is not a real option. And having cron running
"systemctl start" commands is kludgey and has enough problems to
justify native timer units.
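
To make the split concrete, here is a sketch of a timer and the service
it activates. The timer only decides when to run; everything about the
execution environment lives in the service. The unit names, the backup
user and the script path are made up for the example.

```ini
# backup.timer
[Unit]
Description=Run the backup job once a day

[Timer]
OnCalendar=daily

[Install]
WantedBy=timers.target
```

```ini
# backup.service (activated by backup.timer, which has the same name)
[Unit]
Description=Backup job

[Service]
Type=oneshot
# Environment settings that cron itself has no equivalent for.
# The user and the script path are hypothetical.
User=backup
Nice=10
ProtectSystem=full
ExecStart=/usr/local/bin/backup.sh
```

Output from the script is then logged in the journal under
backup.service rather than under cron.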




Re: [systemd-devel] Can service of timers.target having After=multi-user.target create a loop?

2020-07-12 Thread Uoti Urpala
On Sun, 2020-07-12 at 17:13 +0300, Andrei Borzenkov wrote:
> 12.07.2020 16:21, Amish wrote:
> > I have a timer file like this:
> > 
> > [Unit]
> > Description=Foo
> > After=multi-user.target
> > 
> > [Timer]
> > OnCalendar=*:0/5
> > Persistent=false
> > 
> > [Install]
> > WantedBy=timers.target


> > Because AFAIK timers.target runs before multi-user.target. But here
> > something inside timers.target waits for multi-user.target.
> > 
> > So how does systemd resolve this loop?
> > 
> 
> There is no loop. There is no transitive dependency between timer unit
> and service unit. Timer unit gets started early and enqueues start job
> for service unit; this start job waits for multi-user.target according
> to After dependency. Multiple invocations of the timer will try to
> enqueue the start job again, which will simply be merged with the
> existing pending request.

But shouldn't the After line be in the .service file, not the timer, in
this case? The timer should be ready early if it's activated by
timers.target, the service should wait before running.
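
A sketch of that arrangement, assuming the service is called
foo.service and runs a placeholder binary: the timer keeps only its
schedule, and the ordering moves to the service.

```ini
# foo.timer
[Unit]
Description=Foo

[Timer]
OnCalendar=*:0/5
Persistent=false

[Install]
WantedBy=timers.target
```

```ini
# foo.service
[Unit]
Description=Foo job
# The job, not the timer, waits for multi-user.target.
After=multi-user.target

[Service]
Type=oneshot
# Placeholder path.
ExecStart=/usr/local/bin/foo
```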




Re: [systemd-devel] ReadWriteDirectories directive in service file?

2020-06-11 Thread Uoti Urpala
On Thu, 2020-06-11 at 11:39 -0400, Bruce A. Johnson wrote:
> I'm trying to figure out how to resolve these errors that are preventing
> one of my services from running, and I'm kind of at a loss. Systemd is
> stumbling over a read-write directory that needs to be created for the
> service.
> 
> > Jun 04 09:44:03 url-000db95361f2 systemd[3819]: rl-web.service: Failed
> > to set up mount namespacing: /run/systemd/unit-root/run/rl-web/tmp: No
> > such file or directory

> I cannot find the /ReadWriteDirectory/ directive I used in my original
> service file in the current systemd documentation. I tried replacing it
> with /ReadWritePaths/ and threw in /ProtectSystem=True/. (The original
> service file is below.)

> ReadWriteDirectories=/run/rl-web/tmp


I believe the cause of the error is that the directory /run/rl-web/tmp
does not exist when trying to create the namespace. You can only mount
paths that already exist. Why do you have this line anyway? /run is
writable by default, and I don't see anything which would restrict
that. ProtectSystem level "true" does not affect /run.
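
If the service does need its own directory under /run, one option (not
raised in this thread, so treat it as a suggestion) is to let systemd
create it with RuntimeDirectory= instead of listing a path that may not
exist yet. This assumes a systemd version that accepts subdirectories
in RuntimeDirectory=, and the ExecStart= path is a placeholder.

```ini
[Service]
# Creates /run/rl-web/tmp (including /run/rl-web) before the service
# starts; the innermost directory is removed when the service stops.
RuntimeDirectory=rl-web/tmp
# Placeholder path, not the real rl-web command line.
ExecStart=/usr/local/bin/rl-web
```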




Re: [systemd-devel] Requested transaction contradicts existing jobs: start is destructive

2020-04-30 Thread Uoti Urpala
On Thu, 2020-04-30 at 22:18 +0530, Kumar Kartikeya Dwivedi wrote:
> waiting for the stop request initiated previously to finish). Even if
> you use fail as the job mode, the error you get back is "Transaction
> is destructive" or so, not this.

IIRC the patch he mentioned in his mail was one that changed the error
message from "Transaction is destructive" to this.




Re: [systemd-devel] Requested transaction contradicts existing jobs: start is destructive

2020-04-27 Thread Uoti Urpala
On Mon, 2020-04-27 at 11:40 +0100, Mark Bannister wrote:
> I'm not sure where to begin to troubleshoot the problem.  The systemd
> error is occurring sporadically and I haven't yet found a way to
> reproduce it.  I also can't say I actually understand the error
> message or what I'm supposed to do about it.  How is it even possible
> for stop and start jobs to clash like this?

See the systemctl documentation for the --job-mode option. Basically
this should be due to an external request from somewhere to start the
unit while systemd is in the middle of processing a previously started
stop command for it.

The log messages suggest that this might be related to trying to start
a new ssh session while the per-user slice is still being shut down
after closing all previous login sessions; I'm not familiar enough with
this area to tell how things are supposed to work in this situation.




Re: [systemd-devel] How does systemd (pid1) connect to system DBus?

2020-02-04 Thread Uoti Urpala
On Mon, 2020-02-03 at 19:01 +, Dimitri John Ledkov wrote:
> I see that systemd pid1 manager is available on the system DBus.
> 
> But when/how does it connect to it?

unit_notify() calls manager_recheck_dbus(), which connects to the bus
if dbus.service is running.



Re: [systemd-devel] No error even a Required= service does not exist

2019-11-25 Thread Uoti Urpala
On Mon, 2019-11-25 at 13:22 +, mikko.rap...@bmw.de wrote:
> Maybe you need Wants instead of Requires in the service file.

I don't think so. "Wants" is in the opposite direction - it explicitly
does not require the other unit to successfully start. Even with
"After=" specified, it just makes an attempt to start the other unit,
and will then start this unit whether that succeeded or failed.



Re: [systemd-devel] No error even a Required= service does not exist

2019-11-25 Thread Uoti Urpala
On Mon, 2019-11-25 at 15:19 +0200, Mantas Mikulėnas wrote:
> > Requires=xyz.service 
> > 
> > produces no complaint and starts the service even if there is no xyz.service
> > Is this the normal behavior or can I configure systemd to throw an error in 
> > this case?
> 
> The docs say you can get this behavior if you also have After=xyz.service. 
> (Not entirely sure why.)

No, when there IS NOT an "After=xyz.service".

Without "After=", there is no ordering dependency - it just tells that
anything starting this unit will effectively order the start of the
other as well. Without ordering, this unit can be the one to start
first. If the other one fails to actually start later, that doesn't
make systemd go back to stop this one (note that this is consistent
with ordering dependencies - if a depended-on service fails later
during runtime, that does not automatically force a stop of already
running depending services). I guess this logic extends to failures of
the "does not exist at all" type where there was never a chance of
successfully starting the unit.
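
For the strict behavior, the unit would carry both directives, roughly
as in this sketch. The unit name and the binary path are placeholders.

```ini
# this.service (hypothetical name)
[Unit]
Description=Service that must not start without xyz.service
Requires=xyz.service
# With the ordering dependency, the start job of xyz.service has to
# finish first, and its failure makes this unit's start fail too.
After=xyz.service

[Service]
# Placeholder path.
ExecStart=/usr/local/bin/this-daemon
```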



Re: [systemd-devel] systemd-nspawn isolation potentially causing issues with distccmon-text

2019-11-13 Thread Uoti Urpala
On Wed, 2019-11-13 at 10:24 -0500, John wrote:
> I am using systemd-nspawn to compile in a clean environment.  My
> distcc cluster happily accepts requests from the container's build,
> but the monitoring utility, distccmon-text, shows no output. I invoked
> it defining the DISTCC_DIR variable to the correct directory in the
> container.

> Link to strace from the container:
> https://gist.github.com/graysky2/0886025b60335de4c0b19ddf11f7aafb

Your description is somewhat unclear. I'm assuming that this is
actually a strace from OUTSIDE the container (as in, you are not
running the distccmon-text program inside the container, but running it on
the host system and only giving it a path to a filesystem location used
by the in-container compilation process), and that this is the case you
are trying to get working.

I believe the problem is that the program reads PID values from the
filesystem, but PIDs are not the same inside the container and outside.
Thus recording a PID value inside the container and then trying to use
that PID to find the same process from the host system will not work.

If your container runs as a full enough machine with its own systemd
and dbus, then the simplest solution is likely to run the monitoring
utility in the container, for example with:
machinectl shell  



Re: [systemd-devel] systemd-growfs blocks boot until completed

2019-09-28 Thread Uoti Urpala
On Fri, 2019-09-27 at 17:12 +0200, Mirza Krak wrote:
> This is what the systemd-growfs@.service looks like:
> 
> # Automatically generated by systemd-fstab-generator
> [Unit]
> Description=Grow File System on %f
> Documentation=man:systemd-growfs@.service(8)
> DefaultDependencies=no
> BindsTo=%i.mount
> Conflicts=shutdown.target
> After=%i.mount
> Before=shutdown.target local-fs.target
> 
> [Service]
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/lib/systemd/systemd-growfs /data
> TimeoutSec=0

The "Before=local-fs.target" means that local filesystems will not be
considered mounted before the oneshot unit has completed. Lots of other
stuff in boot will depend on local filesystems being mounted, so this
will effectively block boot. I don't think this has changed between
systemd versions - the first growfs version seems to have a similar
line. All generated growfs units have such a dependency; the only thing
that could change is the target, it could be "remote-fs.target" if the
filesystem is considered a remote one.



Re: [systemd-devel] systemd's connections to /run/systemd/private ?

2019-08-15 Thread Uoti Urpala
On Thu, 2019-08-15 at 20:36 +1000, Michael Chapman wrote:
> With systemd 239 I was unable to cause an fd leak this way.
> 
> Still, I would feel more comfortable if I could find a commit that 
> definitely fixed the problem. All of these experiments are just 
> circumstantial evidence.

5ae37ad833583e6c1c7765767b7f8360afca3b07
sd-bus: when attached to an sd-event loop, disconnect on processing errors



Re: [systemd-devel] systemd's connections to /run/systemd/private ?

2019-07-30 Thread Uoti Urpala
On Tue, 2019-07-30 at 14:56 -0400, Brian Reichert wrote:
> Between 13:49:30 and 13:50:01, I see 25 'successful' calls
> to close(), e.g.:
> 
>   13:50:01 close(19)  = 0
> 
> Followed by getsockopt(), and a received message on the supposedly-closed
> file descriptor:
> 
>   13:50:01 getsockopt(19, SOL_SOCKET, SO_PEERCRED, {pid=3323, uid=0, gid=0}, [12]) = 0

Are you sure it's the same file descriptor? You don't explicitly say
anything about there not being any relevant lines between those. Does
systemd really just call getsockopt() on fd 19 after closing it, with
nothing to trigger that? Obvious candidates to check in the strace
would be an accept call returning a new fd 19, or epoll indicating
activity on the fd (though I'd expect systemd to remove the fd from the
epoll set after closing it).



Re: [systemd-devel] journald deleting logs on LiveOS boots

2019-07-19 Thread Uoti Urpala
On Thu, 2019-07-18 at 21:52 -0600, Chris Murphy wrote:
> # df -h
> ...
> /dev/mapper/live-rw  6.4G  5.7G  648M  91% /
> 
> And in the log:
> 47,19636,16754831,-;systemd-journald[905]: Fixed min_use=1.0M
> max_use=648.7M max_size=81.0M min_size=512.0K keep_free=973.1M
> n_max_files=100
> 
> Why is keep_free bigger than available free space? Is that the cause
> of the vacuuming?

The default value for keep_free is the smaller of 4 GiB or 15% of total
filesystem size. Since the filesystem is small and has less than 15%
free, it's already over the default limit. Those defaults are defined
in src/journal/journal-file.c. When over the limit, journald still uses
at least DEFAULT_MIN_USE (increased to the initial size of journals on
disk if any). But it looks suspicious that this is 1 MiB while 
FILE_SIZE_INCREASE is 8 MiB - doesn't this imply that any use at all
immediately goes over 1 MiB?

You can probably work around the issue by setting a smaller
SystemKeepFree in journald.conf.
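
For scale, 15% of the 6.4G filesystem is a bit under 1G, in line with
the logged keep_free=973.1M, while only 648M is actually free. A sketch
of the workaround, with an arbitrary example value:

```ini
# /etc/systemd/journald.conf
[Journal]
# Must be well below the free space actually available (about 648M
# here); 100M is an assumed example value.
SystemKeepFree=100M
```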



Re: [systemd-devel] journald deleting logs on LiveOS boots

2019-07-18 Thread Uoti Urpala
On Mon, 2019-07-15 at 14:32 -0600, Chris Murphy wrote:
> So far nothing I've tried gets me access to information that would
> give a hint why systemd-journald thinks there's no free space and yet
> it still decides to create a single 8MB system journal, which then
> almost immediately gets deleted, including all the evidence up to that
> point.

Run journald under strace and check the results of the system calls
used to query space? (One way to run it under strace would be to change
the unit file to use "strace -D -o /run/output systemd-journald" as the
process to start.)
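
One way to make that change without editing the shipped unit file is a
drop-in, roughly as sketched below. The drop-in file name and both
binary paths are assumptions, and sandboxing options in the stock unit
may interfere with strace, so this may need adjusting.

```ini
# /etc/systemd/system/systemd-journald.service.d/strace.conf
# (hypothetical drop-in file name)
[Service]
# An empty assignment clears the ExecStart= from the main unit file.
ExecStart=
# Paths are assumptions; check where strace and systemd-journald are
# installed on the system in question.
ExecStart=/usr/bin/strace -D -o /run/output /lib/systemd/systemd-journald
```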



Re: [systemd-devel] Logged timestamps for service do not report correct times

2019-07-04 Thread Uoti Urpala
On Wed, 2019-07-03 at 23:51 +, Sam Gilson wrote:
> My hypothesis is that our service loops for a long time (30
> min usually) while waiting for a new DHCP lease to come in, and once
> that is completed, a time sync w/NTP occurs which causes the
> ExecMainStartTimestamp to be rewritten. Why I think this is the case

I do not believe NTP time changes are relevant. Try checking the
journal to see if something triggers the start of a process belonging
to the service at the time the timestamp is reset. Enabling debug
logging for systemd might help if you don't see anything at normal
level.



Re: [systemd-devel] ExecStartPre checking conf

2019-05-20 Thread Uoti Urpala
On Mon, 2019-05-20 at 11:56 +0200, Lennart Poettering wrote:
> about that though). Using ExecStartPre= for a syntax checker appears
> pretty pointless to me, as yes, you just end up doing the same work
> twice, and you might as well have the ExecStart= fail rather than the
> ExecStartPre=, there's little benefit in that.

I've seen people prefer Type=forking over Type=simple for the sake of
checking syntax before starting *depending* units. I'm not particularly
convinced myself that this is valuable, but whatever it's worth,
ExecStartPre should achieve the same goal (depending units are not
started if syntax check fails).
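
A sketch of that setup follows. The daemon name and its --check-config
flag are invented for the example; substitute whatever syntax-check
command the real daemon provides.

```ini
[Service]
# Hypothetical daemon and flag. If this command exits non-zero, the
# unit fails to start and ExecStart= is never run.
ExecStartPre=/usr/local/bin/food --check-config /etc/food.conf
ExecStart=/usr/local/bin/food --config /etc/food.conf
```

Units that declare both Requires= and After= on this service would then
not be started when the check fails.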



Re: [systemd-devel] Question on Before=

2019-02-02 Thread Uoti Urpala
On Sat, 2019-02-02 at 15:03 -0500, Steve Dickson wrote:
> >   Have you enabled a.service?
> > 
> No... I did not think I had to... I figured 
> when b.service was started, a.service would be 
> run regardless of being enabled or disabled.
> 
> Is that not the case?

So you just have the file for a.service lying somewhere on disk, but
haven't enabled it and no other unit references it? That won't do
anything - systemd does not read through all files on disk to see if
there'd be something inside the file which declares that it should
actually be started. Units need to have something else referencing them
for systemd to "see" them at all. "enable" does this by creating a link
from the units/targets referenced in the [Install] section to the file
in question (by creating a symlink in 
/etc/systemd/system/multi-user.target.wants/ for example).
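
As an illustration, a.service could carry an [Install] section like the
one below. The binary path is a placeholder.

```ini
# a.service
[Unit]
Description=Service A
Before=b.service

[Service]
# Placeholder path.
ExecStart=/usr/local/bin/a-daemon

[Install]
# "systemctl enable a.service" creates the symlink
# /etc/systemd/system/multi-user.target.wants/a.service
WantedBy=multi-user.target
```

Before= by itself only orders the two units when both are being
started; it never causes a.service to be started.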




Re: [systemd-devel] Conflation of propagation in dependencies creates race windows

2019-01-19 Thread Uoti Urpala
On Sun, 2019-01-20 at 00:34 +, Jonathon Kowalski wrote:
> On Sat, Jan 19, 2019 at 5:05 PM Uoti Urpala wrote:
> > I think you're wrong here. It makes perfect sense that if unit A has
> > Requires= for another unit, stopping that required unit which A can't
> > work without will stop A too. Removing that logic is not a good
> > solution.

> I am NOT advocating for changing how Requires= currently works, and
> also, no many people by accident use Requires= for its semantics at
> startup.

So what exactly are you saying instead? That this case shouldn't use
Requires= (would be a bad idea, not an appropriate solution to the
actual problem)? That you made a mistake when you brought up this case,
nothing you say is actually relevant to it, and none of your proposed
changes would help solve this issue? Something else you haven't
explained?


> Also, this cannot work. Suppose I have Restart=on-failure in service,
> and service exits on its own normally. How will systemd decide X
> should not be stopped, in case an ExecStop* statement ends up failing,
> and then it *should* restart our service? All of this is going to be
> very racy and nondeterministic.

Systemd could consider X "unneeded" only when Y is not active, has
nothing executing, and has nothing scheduled to execute. This does not
seem racy or nondeterministic.

Anyway, the only realistic alternatives are to either restart X
automatically when Y restarts, or leave Y stopped after all. Restarting
Y while leaving X stopped is not a sane alternative (if the user wants
to allow that, it should be Wants= instead of Requires=). So this still
requires some solution, and nothing you have proposed would help at
all.




Re: [systemd-devel] Conflation of propagation in dependencies creates race windows

2019-01-19 Thread Uoti Urpala
On Sat, 2019-01-19 at 15:54 +, Jonathon Kowalski wrote:
> https://github.com/systemd/systemd/issues/1154 which is similar in
> nature convinces me that systemd currently conflates too many
> properties in the same dependency. The second bug in particular would
> not happen if there was a version of Requires= that disabled the
> PartOf= stuff it currently has, i.e., pick and choose deps.

I think you're wrong here. It makes perfect sense that if unit A has
Requires= for another unit, stopping that required unit which A can't
work without will stop A too. Removing that logic is not a good
solution.

So the case is:

Service X has StopWhenUnneeded=true
Service Y has Requires=X, Restart=always

and the problem is that Y dying can in some circumstances stop X (due
to it being "unneeded" when Y is not actively running), and then
running this stop action on X stops Y completely too (so it will not
restart later), as if the administrator had explicitly stopped X.

I think the ideal behavior here is that X would never be stopped at all
if Y is scheduled to be restarted. Changes that would keep Y running
even if the administrator explicitly runs "systemctl stop X" would
definitely be wrong.
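
For concreteness, the configuration in question might look roughly like
the sketch below. The unit names x.service and y.service and the
ExecStart= paths are placeholders made up for illustration; only the
StopWhenUnneeded=, Requires= and Restart= settings come from the case
itself.

```ini
# x.service (hypothetical name)
[Unit]
Description=Service X, only useful while something requires it
StopWhenUnneeded=true

[Service]
# Placeholder command, not taken from the bug report
ExecStart=/usr/local/bin/x-daemon

# y.service (hypothetical name)
[Unit]
Description=Service Y, cannot work without X
Requires=x.service

[Service]
Restart=always
# Placeholder command, not taken from the bug report
ExecStart=/usr/local/bin/y-daemon
```

With units like these, the problematic window is between Y's main
process dying and its scheduled restart: during that time Y is not
active, so X can be judged "unneeded".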



Re: [systemd-devel] dependency-only .service

2018-10-15 Thread Uoti Urpala
On Tue, 2018-10-16 at 00:47 +0200, Reindl Harald wrote:
> 
> On 16.10.18 at 00:09, Johannes Ernst wrote:
> > I have several programs A, B and C that, while they are running, require 
> > memcached.service to be running.
> > When none of A, B, or C is running, I want memcached.service to not run 
> > either.
> > A, B and C should share the same memcached instance.
> > 
> > How do I best express this?
> 
> https://www.freedesktop.org/software/systemd/man/systemd.target.html

I don't think there is any obvious way to solve this just by defining a
target.



Re: [systemd-devel] dependency-only .service

2018-10-15 Thread Uoti Urpala
On Mon, 2018-10-15 at 15:09 -0700, Johannes Ernst wrote:
> I have several programs A, B and C that, while they are running, require 
> memcached.service to be running.
> When none of A, B, or C is running, I want memcached.service to not run 
> either.
> A, B and C should share the same memcached instance.
> 
> How do I best express this?
> 
> I was thinking I would have a foo@.service, which would be started by A, B 
> and C as foo@A, foo@B, and foo@C right when they come up, and stopped before 
> they quit. This foo@.service would have a dependency on memcached.service, 
> but otherwise not do anything.

Why this indirection through "foo" instead of direct dependencies? Are
A, B and C not systemd services, so you require "foo" as a placeholder
that reflects their dependencies?

> 1. There isn’t a Type=Noop, so having an ExecStart=/bin/true might be my best 
> option?

I think a service with Type=oneshot and RemainAfterExit=true should
work with no ExecStart lines.


> 2. How do I get memcached.service to stop automatically? A Requires= seems to 
> keep it running even after all foo@.service have gone away.

Add StopWhenUnneeded=true to the configuration of the memcached
service.
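
Put together, the two answers above could look something like the
following sketch. The foo@.service name comes from your mail; the
After= line and the drop-in file name are my assumptions. I have not
tested whether a unit with no Exec lines at all is accepted by every
systemd version, so the sketch keeps the ExecStart=/bin/true fallback
from your question.

```ini
# /etc/systemd/system/foo@.service
[Unit]
Description=Dependency holder for %i
Requires=memcached.service
# Assumed: also order the placeholder after memcached
After=memcached.service

[Service]
Type=oneshot
RemainAfterExit=true
# Possibly unnecessary; kept as a fallback in case a service
# without any Exec lines is refused.
ExecStart=/bin/true

# /etc/systemd/system/memcached.service.d/stop-when-unneeded.conf
# (drop-in for the existing unit; the file name is arbitrary)
[Unit]
StopWhenUnneeded=true
```

A would then run "systemctl start foo@A" when it comes up and
"systemctl stop foo@A" before it quits. memcached should stop once the
last foo@ instance is gone, assuming nothing else still pulls it in
(for example multi-user.target, if memcached.service is enabled).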



Re: [systemd-devel] systemd offline-update functionality

2018-10-15 Thread Uoti Urpala
On Mon, 2018-10-15 at 19:07 +, Zhivich, Michael wrote:
> I have not observed this behavior in testing; in fact, I cannot find
> any code in systemd that removes the “/system-update” symlink if the
> update script fails.  Is this still the expected behavior or is the
> update script responsible for deleting the symlink?

"git grep /system-update" in the sources shows:
units/system-update-cleanup.service:ExecStart=/bin/rm -fv /system-update

The "Recommendations" section on the page you linked to has "Make sure
to remove the /system-update symlink as early as possible in the update
script to avoid reboot loops in case the update fails." as the second
item anyway.

 
> The same documentation page recommends setting “FailureAction =
> reboot” in case the update script fails to complete.  However, this
> does not seem to reboot the machine properly (e.g. I don’t see a POST
> screen + grub as expected).  Using “reboot-force” does seem to have
> the expected behavior.  What is the difference between these two
> methods?

They're documented on the systemd.unit manpage for example. IIRC the
"force" variant just kills processes without "properly" shutting down
services through ExecStop actions.
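
As a sketch of where the setting goes (assuming a systemd version where
FailureAction= is documented in systemd.unit; on older releases it was
a [Service] setting, so check your manpage):

```ini
# Fragment of the update service's unit file
[Unit]
# "reboot" follows the normal shutdown procedure.
# "reboot-force" forcibly terminates all processes and then reboots,
# without running the regular shutdown of services.
FailureAction=reboot-force
```

There is also "reboot-immediate", which calls reboot(2) directly and
may lose data, so "reboot-force" is the more conservative of the two
forced variants.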



Re: [systemd-devel] Select on value of log message

2018-08-30 Thread Uoti Urpala
On Thu, 2018-08-30 at 19:32 +0200, Michael Biebl wrote:
> On Thu, 30 Aug 2018 at 18:56, Uoti Urpala wrote:
> > Keeping code using an up-to-date dependency disabled because other
> > packages haven't been updated seems backwards...

> I don't want to use systemd as a stick to beat people^maintainers
> with

I don't think "using it as a stick" is the right way to view it. My
view is just that you shouldn't consider it your problem if other pcre-
using packages haven't been updated. The functionality is useful enough
to enable on its own merits - it's not "using the package as a stick"
or anything like that.

If people want to make it a priority to only have one pcre lib, they
should work on updating the other packages. That doesn't mean you
should view depending on pcre2 to be "forcing" them to do that. Just
that if they don't, they'll have to live with 2 libs. And that's
perfectly reasonable, and preferable to the alternative of making it
your responsibility to update other packages or to hold back
dependencies on pcre2.



Re: [systemd-devel] Select on value of log message

2018-08-30 Thread Uoti Urpala
On Thu, 2018-08-30 at 16:39 +0200, Michael Biebl wrote:
> On Thu, 30 Aug 2018 at 15:50, Jérémy Rosen <jeremy.ro...@smile.fr> wrote:
> > I *think* that it's deactivated in debian because journalctl is a
> > core package and debian doesn't want to pull the regex library into
> > its core...
> > 
> 
> Right, no need to file another bug report. 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890265 

OTOH the older libpcre is currently necessary due to dependencies such
as grep, and the packages say people should migrate to libpcre2. So IMO
it'd be reasonable to add a libpcre2 dependency, and respond to any
complaints by saying that if people want to remove a pcre library, they
should migrate the packages using the old version, so it could be
removed instead.

Keeping code using an up-to-date dependency disabled because other
packages haven't been updated seems backwards...



Re: [systemd-devel] Select on value of log message

2018-08-29 Thread Uoti Urpala
On Wed, 2018-08-29 at 19:49 +0200, Cecil Westerhof wrote:
> There are a lot of ways you can select the output you get from
> journalctl, but it seems you cannot select on the message itself.
> That is why I need to do something like:
>  journalctl | grep 'Database is locked'
> 
> Is this true, or am I overlooking something?

Since version 237, journalctl has had a --grep option.

Message text is not indexed, so performance-wise this still requires
going through all messages (as limited by other options).



Re: [systemd-devel] systemd-timesync and journalctl questions

2018-08-24 Thread Uoti Urpala
On Fri, 2018-08-24 at 14:52 +0300, David Weinehall wrote:
> The second time-related issue pertains to journalctl.
> 
> It seems that journalctl logs (or at least displays) events in date/clock
> order, not in sequence order. While this is definitely useful when trying
> to correlate different logs against each other, it also means that events
> that happen after a date adjustment might end up before already existing
> entries, thus breaking the sequentialness of the log, as follows:
> 
> Date incorrectly set to 2023:
> 
> Log message 1
> Log message 2
> 
> Date corrected to be 2018:
> 
> Log message 3
> Log message 1
> Log message 2
> 
> Typically this is not how we want our log to behave. Is there any way to
> show the log in sequential order?

Within one journal file, entries are stored in the order they are
received, and the normal tools like journalctl will also display events
from within the file in that order. Output like the above shouldn't be
the typical consequence of changing time during a single boot, as the
later entries should usually keep being written in the same file as the
previous ones. (It can happen, but your "reproduction steps" above are
likely incomplete as to what is actually required to see such effects).

Where you can see issues is between entries stored in separate journal
files. Those may not contain information that would allow comparing two
entries stored in the different files by anything other than the
unreliable realtime timestamp.

Note that the journal display code has some fairly serious breakage in
the area of handling messages that may have unreliable timestamps,
which may cause issues such as some messages not being shown at all.
The commit message of the proposed patch in the message linked below is
probably the best writeup (no fixes have been applied to this area
AFAIK):

https://lists.freedesktop.org/archives/systemd-devel/2017-December/039976.html



Re: [systemd-devel] [PATCH] detect-virt: do not return exit failure code when the state is none

2018-05-24 Thread Uoti Urpala
On Thu, 2018-05-24 at 13:47 +0800, Lee, Chun-Yi wrote:
> Currently systemd-detect-virt returns a failure exit code when it
> detects the "none" state. But "none" is still a valid state, not a
> process failure.
> 
> This patch changes the logic to return success code when the state
> is none. It can avoid that subsequent activity is blocked by the
> failure code of systemd-detect-virt process.

This is explicitly documented as "If a virtualization technology is
detected, 0 is returned, a non-zero code otherwise.". So other things
could be relying on the current behavior, and any attempts to change it
despite that should at least modify the documentation too.


Re: [systemd-devel] `Found ordering cycle on _SERVICE_` loops due to `local-fs.target` with `Requires` instead of `Wants` for mounts generated from `/etc/fstab`

2018-05-09 Thread Uoti Urpala
On Wed, 2018-05-09 at 17:39 +0200, Michal Sekletar wrote:
> I was thinking that in addition to better log messages we could
> generate dot graph of the cycle and dump it to journal as a custom
> field. So people can then turn it into the picture and get a better
> understanding of the cycle. Do you think it would be helpful?

What information would the graph contain? The basic structure of a
cycle is always just a simple ring, and I don't see what benefit making
a graph of that would give over just listing the nodes in order.


Re: [systemd-devel] systemd-journald may crash during memory pressure

2018-02-09 Thread Uoti Urpala
On Fri, 2018-02-09 at 12:41 +0100, Lennart Poettering wrote:
> This last log lines indicates journald wasn't scheduled for a long
> time which caused the watchdog to hit and journald was
> aborted. Consider increasing the watchdog timeout if your system is
> indeed that loaded and that is supposed to be an OK thing...

BTW I've seen the same behavior on a system with a single active
process that uses enough memory to trigger significant swap use. I
wonder if there has been a regression in the kernel causing misbehavior
when swapping? The problems aren't specific to journald - the desktop
environment can freeze completely too, for example.
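
For reference, raising the watchdog timeout as suggested in the quoted
text can be done with a drop-in. A minimal sketch, assuming the stock
unit is named systemd-journald.service; the drop-in file name and the
10min value are arbitrary choices for illustration, not
recommendations:

```ini
# /etc/systemd/system/systemd-journald.service.d/watchdog.conf
[Service]
WatchdogSec=10min
```

If the real cause is the kernel stalling under swap pressure, this only
hides the symptom for journald and does nothing for other processes.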


Re: [systemd-devel] Service to pause startup and wait for user input

2018-01-25 Thread Uoti Urpala
On Thu, 2018-01-25 at 17:24 +, Boyce, Kevin P [US] (AS) wrote:
> Does anyone know if there is a way to create a service unit that
> pauses early on in the boot sequence and asks the user a question?
> A reply would be required via keyboard.

There is no separate "pause" feature. You can likely implement the
wanted functionality with explicit dependencies - create a unit which
is only considered "started" when the question has been answered, and
make whatever other units you want to start only after that depend on
the new unit.
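
A rough sketch of that idea, under several assumptions: the unit names
and the /usr/local/sbin/ask-operator script are hypothetical, and the
script is assumed to print the question, read the reply from stdin, and
exit 0 only once it has a valid answer.

```ini
# ask-operator.service (hypothetical name)
[Unit]
Description=Ask the operator a question on the console

[Service]
# With Type=oneshot the unit only counts as started once the
# command has exited, and the start timeout is disabled by default.
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/sbin/ask-operator
StandardInput=tty
StandardOutput=tty
TTYPath=/dev/console

# waits-for-answer.service (hypothetical name)
[Unit]
Requires=ask-operator.service
After=ask-operator.service

[Service]
# Placeholder for whatever should only start after the answer
ExecStart=/usr/local/bin/some-daemon
```

Getting this to run early enough in boot, and keeping other console
output from interfering with the prompt, would need further ordering
dependencies that depend on your setup.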


Re: [systemd-devel] Again, why this strange behavior implied by "auto" in fstab ?

2018-01-25 Thread Uoti Urpala
On Thu, 2018-01-25 at 15:40 +0100, Franck Bui wrote:
> Sorry I was probably not clear: by "do the equivalent of "mount -a"
> during boot only" I meant to mount fs listed in fstab  (without
> "noauto")  the way it's done currently by systemd during boot.
> 
> During boot there shouldn't be any changes. The behavior change happens
> later if neither "auto" nor "noauto" is specified: by default do not try
> to automagically mount filesystems.

This would require distinguishing "boot" and "non-boot" modes of
operation, so that systemd could switch mount handling behavior at some
point. How would you define where "boot" ends? You could try a
definition like "boot is over if there is no unit that is scheduled for
start and has been since the initial transaction", but then if you have
some obscure service that takes 5 hours to start, this could still lead
to very surprising sudden behavior changes on a system that has
otherwise been running for hours...

And if you define boot based on when some software has completed
starting, and have any mounts which are not necessary for boot to
succeed, then there's an obvious race condition - sometimes the devices
would become visible and be mounted during boot, sometimes the boot
would complete first and the devices would not be mounted
automatically. I doubt that would be what you'd want. I think this
issue makes your "only during boot" goal a bad idea in general for such
non-essential mounts.

For mounts that are required as part of the initial transaction, some
kind of "only mount this automatically if it has not been mounted
before" logic (rather than based on "is it boot" directly) is the
closest I can think of to a sane feature. But still not something that
I'd consider a particularly good idea.


Re: [systemd-devel] systemctl start second.service first.service

2018-01-11 Thread Uoti Urpala
On Thu, 2018-01-11 at 15:34 +0100, Reindl Harald wrote:
> On 11.01.2018 at 15:25, Uoti Urpala wrote:
> > I'd guess this is due to systemctl starting each listed unit
> > independently rather than as a single transaction. Thus, the second
> > version first starts second.service without first.service being active
> > at all. Since there's only an After relationship, not Wants/Requires,
> > second.service will immediately start. Then systemctl starts
> > first.service; since second.service is already running, there's nothing
> > the After relationship could affect
> 
> but why?
> 
> the one "After=" should be enough to have a clear ordering of stop/start 
> both as it happens in shutdown/boot

At boot, both would be started as part of the same transaction (same
would happen here if you started a third.service that depended on both
first.service and second.service, then second.service would always
wait). Here second.service is just started individually, and systemd
has no idea at that time that first.service is going to be running at
all. Given that, it really can't behave any differently (it can't delay
the start of second.service to wait for first.service, when as far as
it knows first.service may well never get started at all!). It's only
after second.service is already running that it sees that first.service
will be started, and at that point it's too late to make second.service
wait. There really is nothing the init portion could do differently
given the semantics of bare "After" (the behavior could be changed in
the systemctl binary).
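
To illustrate the single-transaction case, here is a sketch of the
third.service mentioned above. The ExecStart= commands are placeholders
I made up, and first.service is assumed to be unchanged.

```ini
# second.service: ordering only, as in the original question
[Unit]
After=first.service

[Service]
Type=oneshot
# Placeholder command
ExecStart=/usr/local/bin/second-job

# third.service (hypothetical): pulls in both units at once
[Unit]
Wants=first.service second.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
```

"systemctl start third.service" then queues start jobs for all three
units in one transaction, so systemd knows first.service is going to
start and the After= in second.service has something to act on.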


Re: [systemd-devel] systemctl start second.service first.service

2018-01-11 Thread Uoti Urpala
On Thu, 2018-01-11 at 10:59 +, 林自均 wrote:
> I have 2 service units: first.service and second.service. I
> configured "After=first.service" in second.service. Both services are
> "Type=oneshot".
> 
> If I execute:
> 
> # systemctl start first.service second.service
> 
> The ordering dependency will work, i.e. second.service will start
> after first.service.
> 
> However, if I execute:
> 
> # systemctl start second.service first.service
> 
> first.service and second.service will start at the same time.
> 
> Is this behavior documented somewhere? Thanks!

I'd guess this is due to systemctl starting each listed unit
independently rather than as a single transaction. Thus, the second
version first starts second.service without first.service being active
at all. Since there's only an After relationship, not Wants/Requires,
second.service will immediately start. Then systemctl starts
first.service; since second.service is already running, there's nothing
the After relationship could affect.


Re: [systemd-devel] Fragile journal interleaving

2017-12-14 Thread Uoti Urpala
On Thu, 2017-12-14 at 20:34 +0100, Lennart Poettering wrote:
> Well, the idea is that /* */ is used for explanatory comments and //
> is left for local, temporary, non-committable stuff.
> 
> i.e. if you are testing some stuff, and want to comment out some bits
> briefly, use //, but if you add explanations for code that shall be
> committed then please use /* */. That way you can see very easily what
> is temporary stuff you should fix up before committing and what should
> go in.
> 
> different projects have different rules there. Some people use "FIXME"
> or "XXX" for this purpose (and in fact it's up to you if you do, after
> all this kind of stuff should never end up being committed upstream),
> but so far we just used C vs. C++ comments for this, and I think that
> makes sense. 

It doesn't really make sense. Most comments are until end of line, and
'//' is the natural match for that. That's what the comment feature in
most other languages does too. '/*' comments take more space, are more
effort to type, and cause extra problems with unterminated comments.
There can be variation in whether you use '/*' for larger block
comments etc, but IMO it's not at all reasonable to say that the
standard comment marker is suddenly reserved for "TODO" stuff only in
one project. I consider such banning of "random" features about as
reasonable as "in our project you must not use the newfangled a->b
operator, you have to write (*a).b" - that is, not reasonable at all.

I'm also generally opposed to arguments like "different projects have
different rules" for completely arbitrary rules with no real rationale.
Standardization and best practices have value. Again, this is IMO
comparable to the position of the systemd project. "Our project just
happens to require you do stuff differently or require old C dialects"
is analogous to "well our Linux distro just happens to use different
paths etc for no real reason beyond historical accident, you should
just deal with that".

I checked CODING_STYLE, and the entries that look like they're
ultimately purely the result of resistance to change without any valid
justification are:

- trying to prohibit // comments
- trying to prohibit free declaration placement - it's often better for
readable code, and the reason against it is really just not being used
to reading such code
- trying to prohibit variable declarations together with function
invocations. "int res = func(foo);" is perfectly reasonable code.
Again, just a matter of getting used to reading such code.

These could be replaced with more modern best practices, for example:

// This form implies you use the value of "i" after the loop
for (i = 0; i < n; i++)
   ...;

// Use this form if the value of "i" after the loop does not matter
for (int i = 0; i < n; i++)
   ...;


Re: [systemd-devel] Fragile journal interleaving

2017-12-14 Thread Uoti Urpala
On Thu, 2017-12-14 at 10:35 +0100, Lennart Poettering wrote:
> On Thu, 14.12.17 03:56, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > BTW current sd_journal_next() documentation claims "The journal is
> > strictly ordered by reception time, and hence advancing to the next
> > entry guarantees that the entry then pointing to is later in time than
> > then previous one, or has the same timestamp." Isn't this OBVIOUSLY
> > false when talking about realtime timestamps if you don't include
> > caveats about changing the machine time?
> 
> Well, but the philosophical concept of time beyond the implementation
> detail of Linux CLOCK_REALTIME is of course monotonic, hence I think
> it's actually correct what is written there. That said a patch that
> clarifies this would of course be very welcome...

Well if you talk about actual timestamp values then it's not true due
to time changes. If you talk about the "physical reality" time when it
was logged, then it's true for a single journal file, but not true when
interleaving multiple journal files, as the interleaving process may
have to use the timestamps and they're not perfect (even ignoring the
issues discussed in this thread).


> > +if (j->current_location.type == LOCATION_DISCRETE)
> > +// only == 0 or not matters
> 
> Please use C comments, /* */, see CODING_STYLE.

That is a standard C comment... IMO it's quite silly to insist on this
kind of "This is not the C I first learned! I refuse to update the way
I do things!" attitude for systemd. "//" comments and free placement of
variable declarations are normal C and no more questionable than any
other random feature. In practice, the ONLY reason anyone opposes them
is that they were added at a later point. This is exactly the same kind
of resistance to change that underlies most "sysvinit worked just fine
for me, systemd is horrible" attitudes. So systemd in particular really
should do better.


> > +found = compare_with_location(f, &j->current_location);
> 
> this returns an int, and you assign it to a bool. If the "int" was
> actually used in a "boolean" way that would be OK, but it actually
> returns < 0, 0, > 0, hence we should make the conversion explicit here
> and write this as "found = c_w_l(…) != 0".

Well that int/bool distinction was basically why I added the comment
above. I guess adding a redundant comparison is an alternative, matter
of taste mostly...


Re: [systemd-devel] [PATCH] rpcbind.service: Not pulling the rpcbind.target

2017-12-14 Thread Uoti Urpala
On Thu, 2017-12-14 at 13:24 -0500, Steve Dickson wrote:
> 
> On 12/14/2017 12:48 PM, Uoti Urpala wrote:
> > On Thu, 2017-12-14 at 12:05 -0500, Steve Dickson wrote:
> > > +Wants=rpcbind.socket rpcbind.target
> > > +After=rpcbind.socket rpcbind.target
> > 
> > Is this needed when the service has socket activation support? If the
> > only interaction with it is through the socket, it shouldn't matter
> > even if the service is not actually up yet - clients can already open
> > connections to the socket regardless.
> 
> Well things are working as is... but this man page paragraph
> was pointed out to me so I thought these Wants and After were needed.
> 
> So you are saying this patch is not needed?

I'm not familiar enough with rpcbind stuff to say with certainty that
it wouldn't be needed, but at least it seems plausible to me that it
would not be. The mechanism described on the man page is a way to
implement ordering if needed, but if the early availability of the
socket means ordering is never an issue, then it can be ignored.


> > And regardless, that "After" for rpcbind.target seems backwards.
> > Shouldn't it be "Before", so that the target being up signals that the
> > service has already been started?
> 
> I think this makes sense... So if the patch is needed I'll add
> Before=rpcbind.target and remove the target from the After=

Yes.

> > Not directly related, but if that comment is accurate and the socket
> > should be used "no matter what", perhaps that should be "Requires"
> > instead of "Wants" so that if the socket could not be opened for some
> > reason, the service fails instead of starting without socket
> > activation?
> > 
> 
> I was afraid of opening a can of worms here... :-)
> 
> So you are saying Wants and After should be changed to 
> 
> Requires=rpcbind.socket
> Before=rpcbind.target

Depends on the exact semantics you want. "Wants" means that systemd
will try to start the socket if the service is started, but will
continue with the service start even if the dependency fails.
"Requires" guarantees that the service will never be started without
the socket active - if opening the socket fails, then the service start
will return failure too. If you know that the socket unit should always
be used, or the service will either fail or do the wrong thing without
it (such as open a socket with parameters different from what was
configured for the socket unit, and which the admin didn't expect) then
Requires may be more appropriate.
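
As a sketch only (this is not the actual upstream unit, just the
stricter variant discussed in this thread), the [Unit] section would
then contain something like:

```ini
# Fragment of rpcbind.service, [Unit] section only
[Unit]
# Fail the service start if the socket unit cannot be activated,
# instead of continuing without socket activation.
Requires=rpcbind.socket
After=rpcbind.socket
# Pull in the target and order the service before it, as
# systemd.special(7) describes for the portmapper.
Wants=rpcbind.target
Before=rpcbind.target
```

With Wants=rpcbind.socket in place of Requires=, a failure to open the
socket would not prevent the service itself from starting.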


Re: [systemd-devel] [PATCH] rpcbind.service: Not pulling the rpcbind.target

2017-12-14 Thread Uoti Urpala
On Thu, 2017-12-14 at 12:05 -0500, Steve Dickson wrote:
> According to systemd.special(7) manpage:
> 
> rpcbind.target
> The portmapper/rpcbind pulls in this target and orders itself
> before it, to indicate its availability. systemd automatically adds
> dependencies of type After= for this target unit to all SysV init
> script service units with an LSB header referring to the "$portmap"
> facility.


> diff --git a/systemd/rpcbind.service.in b/systemd/rpcbind.service.in
> index f8cfa9f..2b49c24 100644
> --- a/systemd/rpcbind.service.in
> +++ b/systemd/rpcbind.service.in
> @@ -6,8 +6,8 @@ RequiresMountsFor=@statedir@
>  
>  # Make sure we use the IP addresses listed for
>  # rpcbind.socket, no matter how this unit is started.
> -Wants=rpcbind.socket
> -After=rpcbind.socket
> +Wants=rpcbind.socket rpcbind.target
> +After=rpcbind.socket rpcbind.target

Is this needed when the service has socket activation support? If the
only interaction with it is through the socket, it shouldn't matter
even if the service is not actually up yet - clients can already open
connections to the socket regardless.

And regardless, that "After" for rpcbind.target seems backwards.
Shouldn't it be "Before", so that the target being up signals that the
service has already been started?

Not directly related, but if that comment is accurate and the socket
should be used "no matter what", perhaps that should be "Requires"
instead of "Wants" so that if the socket could not be opened for some
reason, the service fails instead of starting without socket
activation?


Re: [systemd-devel] Fragile journal interleaving

2017-12-13 Thread Uoti Urpala
On Wed, 2017-12-13 at 18:40 +0100, Lennart Poettering wrote:
> On Wed, 13.12.17 00:43, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > correct position, so I don't see why the monotonicity discard would be
> > needed for that case either?
> 
> I figure you are right.
> 
> Any chance you can prep a patch for this change?

A patch with a fairly long commit message explaining the rationale and
related issues attached.

BTW current sd_journal_next() documentation claims "The journal is
strictly ordered by reception time, and hence advancing to the next
entry guarantees that the entry then pointing to is later in time than
then previous one, or has the same timestamp." Isn't this OBVIOUSLY
false when talking about realtime timestamps if you don't include
caveats about changing the machine time?

From 7070d7a3158cb0e0d14ba6167f7e0548a0022f55 Mon Sep 17 00:00:00 2001
From: Uoti Urpala 
Date: Thu, 14 Dec 2017 01:41:36 +0200
Subject: [PATCH] journal: fix lost entries when timestamps are inconsistent

When journalctl or the sd_journal_* API interleaved entries from
multiple journal files into one sequence, inconsistent ordering
information in the files could lead to entries being silently
discarded. Fix this so that all entries are shown at least in some
order, even if no guarantees are made about the order.

The code determining the interleaved order of entries is based on
pairwise comparisons, where a pair of entries is compared using the
first applicable of three progressively less reliable ways to compare:
first by sequence number if seqnum id matches, then by monotonic
timestamp if boot id matches, then by realtime timestamp (entries
within one journal file are always implicitly ordered by their order
in the file, which should match seqnum order unless the file is
corrupt). Unfortunately, while this method could be said to give the
"right" answer more often than just comparing by the always-available
realtime timestamp, it does not give a globally consistent order - if
the ways to compare do not agree, it's possible to have three entries
A, B and C such that A < B, B < C and C < A.
[...]
-if (j->current_location.type == LOCATION_DISCRETE) {
-int k;
-
-k = compare_with_location(f, &j->current_location);
-
-found = direction == DIRECTION_DOWN ? k > 0 : k < 0;
-} else
+if (j->current_location.type == LOCATION_DISCRETE)
+// only == 0 or not matters
+found = compare_with_location(f, &j->current_location);
+else
 found = true;
 
 if (found)
-- 
2.15.1


Re: [systemd-devel] Fragile journal interleaving

2017-12-12 Thread Uoti Urpala
On Tue, 2017-12-12 at 21:38 +0100, Lennart Poettering wrote:
> Maybe the approach needs to be that we immediately increase the read
> record ptr of a specific file by one when we read it, so that we know
> we monotonically progress through the file. And then change the logic
> that looks for the next entry across all files to honour that, and
> then simply skip over fully identical entries, but not insist on
> monotonic timers otherwise.

Isn't this pretty much what the code already does (except for the
"fully identical entries" part)? The next_beyond_location() function
already gives the next entry from a given file, and seems to use the
internal order of the file when doing that. So the only change would be
making the duplicate check only discard actually identical entries. And
now that I checked the surrounding code, it looks like even in the new-
file-added case you mentioned next_beyond_location() would call
find_location_with_matches(), which should seek the new file to the
correct position, so I don't see why the monotonicity discard would be
needed for that case either? 


> With that approach we can be sure that we never enter a loop, and
> we'll mostly filter out duplicates (well, except if duplicate entries
> show up in multiple files in different orders, but that's OK then,
> dunno).

You could probably make the code loop by removing and adding back files
with the right timing, but that's not a very realistic case. So unless
there's some other part of the code I've missed, it looks like just
changing the discard to check for equality rather than non-monotonicity 
would be an improvement. That would still leave issues like the order
being "non-deterministic" (it depends on directory read order for files
for example) at least.



Re: [systemd-devel] Fragile journal interleaving

2017-12-12 Thread Uoti Urpala
On Tue, 2017-12-12 at 17:09 +0100, Lennart Poettering wrote:
> On Mo, 11.12.17 00:36, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > consider a clear bug: there's code in next_beyond_location() which
> > skips the next entry in a file if it's not in the expected direction
> > from the previous globally iterated entry, and this can discard valid
> > entries. A comment there says it's meant to discard duplicate entries
> > which were somehow recorded in multiple journal files (which I'd assume
> > to compare equal), but it also discards non-duplicate entries which
> > compare backwards from the previously shown one.
> 
> Note that two entries will only compare as fully identical if their
> "xor_hash" is equal too. The xor_hash is the XOR combination of the
> hashes of all of the entry's fields. That means realistically only
> records that actually are identical should be considered as such.

I assume that would be suitable for handling the case of actual
duplicates? How do those happen anyway?
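To make the quoted xor_hash description concrete, here is a minimal Python sketch. It assumes SHA-256 truncated to 64 bits as a stand-in for the per-field hash (the journal uses a different hash function), and field_hash() and xor_hash() are hypothetical names for illustration only. It shows that the XOR combination does not depend on field order and changes when any single field changes.

```python
import hashlib
from functools import reduce


def field_hash(field: bytes) -> int:
    # Assumption: any 64-bit hash works for illustration; this is NOT
    # the hash function the journal itself uses.
    return int.from_bytes(hashlib.sha256(field).digest()[:8], "little")


def xor_hash(fields) -> int:
    # XOR of all per-field hashes; XOR is commutative, so order is irrelevant.
    return reduce(lambda acc, f: acc ^ field_hash(f), fields, 0)


entry = [b"MESSAGE=hello", b"PRIORITY=6"]
reordered = [b"PRIORITY=6", b"MESSAGE=hello"]
changed = [b"MESSAGE=hello", b"PRIORITY=5"]
```

Two entries with the same fields in a different order get the same value, while an entry differing in one field gets a different one (barring a hash collision).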

 
> > code searching for earlier entries first found the early system
> > journal, then moved to the user journal because it had smaller seqnums,
> > and finally moved to some other file that started 5 days before the end
> > (earlier than the user journal and with a different seqnum id - there
> > didn't happen to be other earlier files later in the iteration order).
> > After printing one entry from there, the next_beyond_location()
> > "duplicate" code then discarded all the earlier valid entries.
> > 
> > I'm not sure about the best way to fix these issues - are there
> > supposed to be any guarantees about interleaving? At the very least the
> > "duplicate" check should be fixed to not throw away arbitrary amounts
> > of valid entries. Any other code relying on assumptions about valid
> > ordering? Is the interleaving order supposed to be stable independent
> > of things like in which order the files are iterated over?
> 
> So, the current "duplicate" check is not just that actually, it's also
> a check that guarantees that we progress monotonically, and never go
> backwards in time, even when journal files are added or removed while
> we iterate through the files.

If journal files can be added while iterating, I assume that could be
handled by special code run when such an event happens. IMO it's
important to have a basic guarantee that all entries in the journal
files will be shown at least in _some_ order when you try to iterate
through them all, such as with plain "journalctl". And thus the code
should make sure not to discard any entry unless it's confirmed to be a
full duplicate.
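A rough Python sketch of the difference between the two discard rules follows. It is a simplified model, not the actual next_beyond_location() code: entries are hypothetical (realtime, message) tuples, and only the forward direction is modelled.

```python
def next_entry(file_entries, prev, discard):
    """Return the first entry in file order that `discard` does not reject."""
    for entry in file_entries:
        if not discard(entry, prev):
            return entry
    return None


def discard_non_monotonic(entry, prev):
    # Rule as described for the current code: drop everything that does
    # not compare strictly later than the previously shown entry.
    return entry[0] <= prev[0]


def discard_identical(entry, prev):
    # Suggested rule: drop only entries that are full duplicates.
    return entry == prev


prev = (100, "shown")
file_entries = [
    (100, "shown"),                     # true duplicate of prev
    (90, "valid, earlier timestamp"),   # valid entry that compares backwards
    (110, "later"),
]
```

With discard_non_monotonic the valid entry stamped 90 is silently skipped and (110, "later") is returned; with discard_identical only the real duplicate is dropped and the entry stamped 90 is shown.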


> I am not entirely sure what we can do here. Maybe this can work: beef
> up the comparison logic so that it returns more than
> smaller/equal/larger but also a special value "ambiguous". And when
> that is returned we don't enforce monotonicity strictly but instead go
> record-by-record, if you follow what I mean?

I don't see how that would help, at least not without some extra
assumptions/changes. In my example problem case above, the ambiguous
comparisons happen when deciding which file to get the first entry
from. There's no natural default "first file", so even if you only know
it's ambiguous you have to pick some anyway. If you pick the one the
current code does, the following discard check is not ambiguous - it's
discarding entries with earlier realtime and non-comparable other
values. Or do you mean that if an ambiguous comparison was EVER seen,
monotonicity would be permanently disabled? I don't really see an
advantage for that over just not enforcing monotonicity at all, and
handling any added-file special cases separately.



[systemd-devel] Fragile journal interleaving

2017-12-10 Thread Uoti Urpala
The code that tries to determine which of two journal entries is
"earlier" (similar logic in journal_file_compare_locations() and
compare_with_location()) does this by trying different progressively
less reliable ways to compare - first by seqnum if seqnum id matches,
then by monotonic time if boot id matches, then by realtime timestamp.
Unfortunately, while this method could be said to give the "right"
answer more often than just comparing by the always-available realtime
timestamp, it does not give a valid ordering relation. As in, unless
you can already rely on all those comparisons giving the same
consistent order (when you could probably just use realtime anyway),
it's possible that you get a relation like A < B < C < A.
[...]
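The non-transitive case described above can be reproduced with a small Python sketch. This is a simplified model of the fallback comparison (seqnum, then monotonic time, then realtime), not the actual C code; the Entry type and the sample values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a journal entry location.
Entry = namedtuple("Entry", "name seqnum_id seqnum boot_id monotonic realtime")


def sign(x, y):
    return (x > y) - (x < y)


def compare(a, b):
    """Return <0, 0 or >0, trying progressively less reliable keys."""
    if a.seqnum_id == b.seqnum_id:
        return sign(a.seqnum, b.seqnum)
    if a.boot_id == b.boot_id:
        return sign(a.monotonic, b.monotonic)
    return sign(a.realtime, b.realtime)


# A and B share a seqnum id, B and C share a boot id, and C and A share
# neither, so each pair is compared by a different key.
A = Entry("A", seqnum_id="s1", seqnum=1, boot_id="b0", monotonic=500, realtime=300)
B = Entry("B", seqnum_id="s1", seqnum=2, boot_id="b1", monotonic=10, realtime=100)
C = Entry("C", seqnum_id="s2", seqnum=1, boot_id="b1", monotonic=20, realtime=200)
```

Here compare(A, B), compare(B, C) and compare(C, A) are all negative, so the three entries form a cycle and no sort order can satisfy every pairwise comparison.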


Re: [systemd-devel] Question about service dependency handling in systemd-228

2017-11-25 Thread Uoti Urpala
On Sat, 2017-11-25 at 12:08 +0700, Bao Nguyen wrote:
> [   41.154231] systemd[1]: nss-lookup.target: Dependency 
> Before=nss-lookup.target dropped
> [   41.297229] systemd[1]: sockets.target: Found ordering cycle on 
> sockets.target/start
> [   41.297236] systemd[1]: sockets.target: Found dependency on 
> asi-My-5101.socket/start
> [   41.297239] systemd[1]: sockets.target: Found dependency on 
> My-sshd.target/start
> [   41.297241] systemd[1]: sockets.target: Found dependency on 
> My-syncd.service/start
> [   41.297244] systemd[1]: sockets.target: Found dependency on 
> My-nfs-client.service/start
> [   41.297246] systemd[1]: sockets.target: Found dependency on 
> My-handling.service/start


> My question is whether there are any significant differences in how the
> dependency tree is built and how dependency cycles are handled between
> systemd-210 and systemd-228 that could lead to my current situation. I have
> checked the changelog and source code but have not found any useful info.

Rather than start by trying to find differences between systemd
versions, I suggest you first find out exactly what goes wrong under
the newer systemd version. Exactly which dependency is wrong and
shouldn't be there? Where does that dependency come from? A system
where ordering dependencies form a cycle is not valid, so some
dependency explicitly listed in your unit files or implicitly added by
systemd must be wrong. After finding that out, you can then try to find
out what differs under the older systemd if it's still relevant.
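As a way to hunt for the offending dependency, one could dump the ordering dependencies and search them for a cycle. The Python sketch below is only an illustration of that idea under stated assumptions: the unit names are hypothetical, the `after` mapping would have to be filled in from your own unit files, and this is a plain depth-first search, not systemd's own cycle detection.

```python
def find_cycle(after):
    """Return one ordering cycle as a list of units, or None.

    `after` maps a unit to the units it must be started after.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    state = {}
    stack = []

    def visit(unit):
        state[unit] = GREY
        stack.append(unit)
        for dep in after.get(unit, ()):
            s = state.get(dep, WHITE)
            if s == GREY:
                # dep is on the current path: the cycle runs from dep to here
                return stack[stack.index(dep):] + [dep]
            if s == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[unit] = BLACK
        return None

    for unit in list(after):
        if state.get(unit, WHITE) == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Hypothetical example: a socket ordered after a target that is in turn
# ordered (indirectly) after the socket.
cyclic = {
    "a.socket": ["b.target"],
    "b.target": ["c.service"],
    "c.service": ["a.socket"],
}
acyclic = {"x.service": ["y.service"], "y.service": []}
```

Each edge on the returned path is a candidate for the dependency that should not be there.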

In the above log, the most suspicious part is that it seems to say
"asi-My-5101.socket" depends on "My-sshd.target". A socket unit almost
certainly shouldn't have such dependencies, as normally a listening
socket can be opened regardless of the state of the rest of the system
(the main exception I can think of would be a UNIX socket at a
filesystem path that requires mounting something, but normally you
wouldn't do that...).


> And what does the message "nss-lookup.target: Dependency 
> Before=nss-lookup.target dropped" mean? I do not see it in systemd-210.

Apparently the target had a dependency saying that it should be started
before itself, and such a blatantly impossible dependency was ignored.



Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Uoti Urpala
Resend with correct list address
On Tue, 2017-07-11 at 12:00 +0200, Lennart Poettering wrote:
> On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> > Debian patch seems to be a cherry pick from upstream glibc. Is there
> > evidence of some error that would cause effects only visible on Debian
> > and nowhere else? And/or has the change been reverted or behavior
> > otherwise modified upstream to limit the range of relevant versions?
> 
> See the links Vito provided:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> 
> i.e. Debian undid the PID caching to fix some issue that has been fixed
> properly now, and hence the PID caching should be turned on again.

That seems backwards: the commit cherry-picked by Debian seems to be
c579f48edba88380635a, which is NEWER than above 0cb313f7cb0e418b3d56.
In other words, it seems 0cb313 was a failed attempt at a fix and the
patch cherry-picked by Debian was needed to properly fix things.

> On Fedora at least getpid() is not visible in strace, and is fully
> cached, as it should be.

Is that glibc 2.25? It seems to contain c579f48 at least; could be a
Fedora-specific change though?


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Uoti Urpala
On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> Normally it's dead cheap to check that, it's just reading and
> comparing one memory location. It's a pity that this isn't the case
> currently on Debian, but as it appears this is an oversight on their
> side, and I am sure it will be eventually fixed there, if it hasn't
> already.

Are you sure about those "Debian only" and "will be 'fixed'" parts? The
Debian patch seems to be a cherry pick from upstream glibc. Is there
evidence of some error that would cause effects only visible on Debian
and nowhere else? And/or has the change been reverted or behavior
otherwise modified upstream to limit the range of relevant versions?



Re: [systemd-devel] Systemd license vs. libcryptsetup license

2017-06-08 Thread Uoti Urpala
On Thu, 2017-06-08 at 17:14 +, Zbigniew Jędrzejewski-Szmek wrote:
> On Thu, Jun 08, 2017 at 06:03:37PM +0200, Julian Andres Klode wrote:
> > I'm not sure where you get that from. The usual interpretation is that
> > linking to a GPLed library means the linked work must be GPL as well.
> 
> I don't think that's true. It only must have a compatible license.

I think that is the default FSF position. There are at least some cases
where it's likely not automatic (for example, if there's a widespread
API/ABI that is provided by both GPLed and differently-licensed
libraries, an executable that works with both seems to have at least a
reasonable claim to not being a derivative work). However, assuming
that using a library may make the executable a derivative work seems to
be the only safe default assumption.

If the only thing you know is that some code uses the library, that may
mean things like nontrivial inline functions being included in the
compiled code, or copy relocations copying arbitrary amounts of data
into an executable. It seems pretty clear that this can be considered a
derived work. So I don't think you can ever claim that GPL wouldn't
cover the linked work without at least some analysis of the specific
library in question and how it's used in the program.



Re: [systemd-devel] Systemd license vs. libcryptsetup license

2017-06-08 Thread Uoti Urpala
On Thu, 2017-06-08 at 22:00 +0300, Uoti Urpala wrote:
> compiled code, or copy relocations copying arbitrary amounts of data
> into an executable. It seems pretty clear that this can be considered a

Rereading that, copy relocations are actually not that good an example
since the copying normally happens at runtime. So better consider just
inline functions.



Re: [systemd-devel] socket unit refusing connection when JOB_STOP is pending

2017-05-30 Thread Uoti Urpala
On Tue, 2017-05-30 at 18:07 +, Moravec, Stanislav (ERT) wrote:
> OK. Understood, thanks much!
> We'll try to follow up on using some parent process (xinetd or something like 
> that).

BTW your problem description wasn't very clear. Is your specific
problem case about socket activation of normal services (the issue
being that if the service has not been started before shutdown it won't
be, but things work if something did start it earlier) or about
Accept=true sockets that start a new service instance from template for
each connection? I'd guess that the latter might be somewhat easier to
support architecturally, though I'm not familiar enough with that to
say for sure.



Re: [systemd-devel] hanging reboot

2017-03-07 Thread Uoti Urpala
On Wed, 2017-03-08 at 13:05 +1300, Sergei Franco wrote:
> 
> The official ubuntu fix does not resolve the hang.
> The problem is that the unattended-upgrades script relies on /var/run
> being mounted, if the /var/ is a separate filesystem it gets
> unmounted and thus hanging the script (where it waits for the lock to
> be available).
> 
> 
> Here is the official ubuntu unattended-upgrade-shutdown unit:

> How would one fix this unit so it is run before the file systems
> get unmounted?

To answer your specific question, you could declare a dependency on
/var being mounted. However, if your above comment about it relying
only on /var/run being mounted is accurate, there is a better solution.
/var/run is just a legacy compatibility symlink to /run. Fix the script
to use the real /run path directly instead of /var/run, and there will
be no dependency on /var.

OTOH if that script actually installs packages then the installation
probably requires /var (and possibly other filesystems?) to be mounted,
and you should add RequiresMountsFor and other possible dependencies
required for package installation to work.



Re: [systemd-devel] rpcbind.socket failing

2016-11-02 Thread Uoti Urpala
On Tue, 2016-11-01 at 20:01 +0300, Andrei Borzenkov wrote:
> 01.11.2016 18:47, Lennart Poettering wrote:
> > DefaultDependencies=no now, which means you run in early boot, and
> > then things become more complex, as /var is not around fully yet. 
> > 
> 
> Unit file had RequiresMountsFor=/var/run. If this is not enough to
> ensure unit starts only after /var/run is mounted, then what exactly
> does RequiresMountsFor do?

/var/run is not a mountpoint, it's an obsolete compatibility symlink to
/run. Thus that RequiresMountsFor is satisfied as long as /var is
mounted. The symlink is created by tmpfiles. Though even without
correct dependencies on tmpfiles, I'd expect the symlink to already be
around from previous boots on a persistent /var partition, so I don't
know why that'd cause a visible failure.
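To illustrate why RequiresMountsFor=/var/run only ends up waiting for /var, here is a small Python sketch. It assumes that the directive considers the mount units for each prefix of the given path, and it uses a simplified unit-name escaping (slashes become dashes, no handling of special characters). The function names are hypothetical.

```python
import posixpath


def mount_unit_name(path):
    """Simplified mount unit name for an absolute path (no special-character escaping)."""
    path = posixpath.normpath(path)
    if path == "/":
        return "-.mount"
    return path.strip("/").replace("/", "-") + ".mount"


def candidate_mount_units(path):
    """Mount units that could be needed to reach `path`: one per path prefix."""
    parts = [p for p in posixpath.normpath(path).split("/") if p]
    prefixes = ["/"] + ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return [mount_unit_name(p) for p in prefixes]


units = candidate_mount_units("/var/run")
```

For "/var/run" the candidates are "-.mount", "var.mount" and "var-run.mount". Since /var/run is a symlink and not a mount point, no var-run.mount unit exists, so only the root and /var mounts can actually be waited for.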



Re: [systemd-devel] rpcbind.socket failing

2016-11-02 Thread Uoti Urpala
On Wed, 2016-11-02 at 06:16 +0100, Kai Krakow wrote:
> Do you still use DefaultDependencies=no?
> 
> Then /usr is probably not available that early (now that it can start
> much earlier due to /run being available). What's the exercise of
> disabling default dependencies anyway?

/usr is always available, it's mounted together with the root fs
(normally by the initramfs if it is a separate partition).



Re: [systemd-devel] Transaction contains conflicting jobs 'restart' and 'stop'

2016-03-10 Thread Uoti Urpala
On Thu, 2016-03-10 at 17:51 +, Orion Poplawski wrote:
> Orion Poplawski  cora.nwra.com> writes:
> > 
> > # systemctl restart firewalld
> > Failed to restart firewalld.service: Transaction contains
> > conflicting jobs
> > 'restart' and 'stop' for fail2ban.service. Probably contradicting
> > requirement dependencies configured.

> It appears that this is a trigger for this issue.  Removing the
> conflicts=iptables.service removes it.  This seems like a bug to me
> though -
> why is iptables being brought in if the PartOf= is a one-way dep?

I guess it's because firewalld.service has
"Conflicts=iptables.service", and thus (re)starting firewalld.service
stops iptables.service; fail2ban.service has PartOf to both, thus both
the restart and stop are propagated, and conflict.

Claiming a PartOf relationship to both of two conflicting services is
the problem here. I doubt such a use case was what PartOf was designed
to support.
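The propagation I'm guessing at can be modelled in a few lines of Python. This is a deliberately simplified sketch, not how systemd builds transactions internally: Conflicts= adds a stop job for the other unit, and PartOf= copies the stop/restart job of the parent onto the member. Function names are made up for illustration.

```python
def build_transaction(unit, job, conflicts, part_of):
    """Collect (unit, job) pairs for a simplified transaction.

    conflicts: unit -> units that get stopped when it is (re)started
    part_of:   unit -> units whose stop/restart jobs it follows
    """
    jobs = {(unit, job)}
    for other in conflicts.get(unit, ()):
        jobs.add((other, "stop"))
    for member, parents in part_of.items():
        for parent, parent_job in list(jobs):
            if parent in parents:
                jobs.add((member, parent_job))
    return jobs


def conflicting(jobs):
    """Return units that ended up with more than one kind of job."""
    per_unit = {}
    for unit, job in jobs:
        per_unit.setdefault(unit, set()).add(job)
    return {u: js for u, js in per_unit.items() if len(js) > 1}


conflicts = {"firewalld.service": ["iptables.service"]}
part_of = {"fail2ban.service": ["firewalld.service", "iptables.service"]}
jobs = build_transaction("firewalld.service", "restart", conflicts, part_of)
```

In this model fail2ban.service receives both a 'restart' job (from firewalld.service) and a 'stop' job (from iptables.service), which matches the error message quoted above.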



Re: [systemd-devel] systemd is trying to break mount ordering

2015-06-15 Thread Uoti Urpala
On Mon, 2015-06-15 at 13:24 +0200, Jan Synáček wrote:
> 
> 192.168.122.1:/srv/nfs /mnt/nfs nfs defaults 0 0
> /var/tmp/test.iso /mnt/nfs/content iso9660 loop,ro 0 0
> 
> Notice the last two lines. There is an NFS mount mounted to /mnt/nfs 
> and
> an ISO filesystem mounted into /mnt/nfs/content, which makes it
> dependent on the NFS mount.
> 


> Jun 15 10:37:55 rawhide-virt systemd[1]: firewalld.service: Found 
> dependency on local-fs.target/start
> Jun 15 10:37:55 rawhide-virt systemd[1]: firewalld.service: Found 
> dependency on mnt-nfs-content.mount/start
> Jun 15 10:37:55 rawhide-virt systemd[1]: firewalld.service: Found 
> dependency on mnt-nfs.mount/start
> Jun 15 10:37:55 rawhide-virt systemd[1]: firewalld.service: Found 
> dependency on network.target/start


> Isn't systemd trying to delete too many jobs while resolving the 
> cycles?

I don't think the cycle breaking is to blame. It's simple and only
considers one cycle at a time, but in this case I doubt there exists
any good solution that could be found. The cycle breaking only
potentially breaks non-mandatory dependencies (Wants). local-fs.target
dependencies on mounts and (probably, didn't check) dependencies
between mounts are "Requires", so the dependency that's arguably wrong
here cannot be broken. Once local-fs.target gets a hard dependency on
network the situation is already pretty bad, and you probably shouldn't
expect it to recover gracefully from that.




Re: [systemd-devel] systemd-218 - Requisite implies TriggeredByRestartOf

2015-05-19 Thread Uoti Urpala
On Tue, 2015-05-19 at 14:06 +0200, Lennart Poettering wrote:
> On Tue, 19.05.15 14:15, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > As for Evert's original problem, I think it's that RESTART is propagated
> > to all RequiredBy units unconditionally - even if those are currently
> > stopped! This affects both Requires= and Requisite= in exactly the same

> Hmm, so basically you are saying that currently RESTART is propagated
> as RESTART to all depending units, but you suggest that it should be
> propagated as TRY_RESTART? Did I get this right?

Yes, I think that should fix it.

I feel that a TRY_RESTART style true restart would be a more natural
base operation than the current "either start or restart" one, with the
START_OR_RESTART for "systemctl restart" semantics collapsing to either
START or RESTART. With that terminology, the logic would simply be that
RESTART is propagated, START is not.




Re: [systemd-devel] systemd-218 - Requisite implies TriggeredByRestartOf

2015-05-19 Thread Uoti Urpala
On Tue, 2015-05-19 at 01:26 +0200, Lennart Poettering wrote:
> On Tue, 19.05.15 00:55, Lennart Poettering (lenn...@poettering.net) wrote:
> > On Thu, 14.05.15 21:23, Evert (evert.gen...@planet.nl) wrote:
> > > According to the systemd documentation, Requisite disallows starting a
> > > unit unless the specified unit has been started. This seems to work
> > > fine, however, if the specified unit has been restarted, this unit will
> > > be started too!
> > > This is not what should happen and it doesn't happen with a stop and
> > > start of the specified unit, so clearly, restart behaves different than
> > > stop followed by start.

> > "systemctl stop" by the user. Now, as a special shortcut we currently map
> > "Requisite=" and "Required=" to a reverse dependency of
> > "RequiredBy=", which is problematic here, since we'll hence make no
> > distinction between the two when processing "systemctl stop".
> > 
> > I figure we need to split up the reverse dep here, and introduce a
> > separate RequisiteBy= dependency so that we can avoid this problem...
> 
> Fixed in git. Please verify.

I think you're talking about something quite different than the problem
described by Evert. The bug report was about a currently stopped
depending unit being started by actions on a depended-on unit, which is
equally wrong for either "Requires=" or "Requisite=", and is no reason
for adding any new distinction between them! And the problem was also
about "systemctl restart", while "systemctl stop" worked as expected.

Did your change disable propagation of STOP from depended-on unit to the
one using "Requisite="? If so, I think that's wrong and should be
reverted - the documentation says Requisite is "similar to Requires=",
and gives no reason to expect that propagation in the direction from the
depended-on unit to the depending-on unit would be affected.

As for Evert's original problem, I think it's that RESTART is propagated
to all RequiredBy units unconditionally - even if those are currently
stopped! This affects both Requires= and Requisite= in exactly the same
way. The problem is not easily visible because systemd aggressively
garbage collects units that are not active, and RequiredBy only exists
when the requiring unit is loaded; this means currently stopped units
are usually hidden from the logic that could incorrectly start them. But
if there is some other reason to keep the unit loaded, then the bug is
visible.

A configuration like this should be sufficient to reproduce:
a1.service is arbitrary
a2.service has Requires=a1.service
a3.service has OnFailure=a1.service a2.service
Keep a3.service running to ensure that the OnFailure references keep
everything else loaded. Leave a2.service stopped. Restart a1.service;
this will start a2.service (via propagated restart), when it clearly
should not.
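The same scenario as a small Python model, in case it helps to see the two propagation variants side by side. This is only a sketch of the behaviour described above (unconditional RESTART propagation versus TRY_RESTART-style propagation); the function is hypothetical and ignores everything else the job engine does.

```python
def propagate_restart(unit, required_by, active, try_restart):
    """Return the set of units running after restarting `unit`.

    required_by: unit -> units that have Requires= on it
    active:      units that are running before the restart
    try_restart: if True, dependents that are stopped are left alone
    """
    running = set(active) | {unit}
    for dependent in required_by.get(unit, ()):
        if try_restart and dependent not in active:
            continue  # TRY_RESTART semantics: do not start stopped units
        running.add(dependent)  # RESTART semantics: start or restart
    return running


required_by = {"a1.service": ["a2.service"]}
active = {"a1.service", "a3.service"}  # a2.service is deliberately stopped

as_restart = propagate_restart("a1.service", required_by, active, try_restart=False)
as_try_restart = propagate_restart("a1.service", required_by, active, try_restart=True)
```

With unconditional propagation a2.service ends up running; with TRY_RESTART-style propagation it stays stopped, which is the behaviour I'd expect.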




Re: [systemd-devel] Deadlocks with reloading jobs which are part of current transaction

2015-05-01 Thread Uoti Urpala
On Mon, 2015-04-27 at 18:07 +0200, Lennart Poettering wrote:
> On Wed, 04.02.15 23:48, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > If you mean something like "systemctl restart --no-block
> > mydaemon-convert-config.service; systemctl reload mydaemon.service", I
> > don't see why you'd ever /expect/ this to work with any reload semantics
> > - isn't this clear user error, and will be racy with current systemd
> > code just as much as the proposed fix? 
> 
> Yupp, this is what I mean. (though I'd actually specify the --no-block
> in the second command too, though this doesn't make much of a
> difference...)


> > And in any case I'd consider the semantics of reload to be "switch
> > to configuration equal or newer than what existed *when the reload
> > was requested*", without any guarantees that changes from operations
> > queued but not finished before calling reload would be taken into
> > account.
> 
> The queue is really a work queue, and the After= and Before= deps
> dictate how the work can be parallelized or needs to be serialized. As
> such if i have 5 jobs enqueued that depend on each other, i need to
> make sure they are executed in the order specified and can operate on
> the results of the previous job.
> 
> I hope this makes sense...

After those clarifications I believe I now understand what kind of
example case you meant, and it does now seem a meaningful case to
consider; however, I still think that you're wrong, as your example case
turns out to work fine and is not actually a counterexample to the kind
of changes I was talking about.

If I understood correctly, you're talking about a case where service B
has "After=A.service", both A and B have queued jobs where the B job is
a reload, and the queued job for A might change the configuration for B
(so the reload needs to happen after that); and you're worried that
immediately returning success for the reload could create a violation of
the "after job A" requirement. Is this reload property of "After"
documented anywhere? The code does seem to apply it to reloads, but
systemd.unit documentation only talks about start/stop. Anyway, when
you consider what actually happens with my suggested change, it turns
out that even these "After" semantics for reload still work.

The situation where my changes would result in different behavior is
when B has a start job queued, but no code for B is running yet, and you
request a reload for B; current code waits for the start of B before the
reload is considered complete, whereas my change makes the reload return
immediate success. This does not actually change the semantics above:
the only difference is when the reload operation is CONSIDERED COMPLETE,
there is NO difference in what operations are actually run or in which
order! [1] Current code merges RELOAD to existing START and returns
success for reload after START has completed, whereas my change returns
success immediately; but both run exactly the same START operation with
the same ordering constraints, which already ensure that it happens
after A.service (START already has the ordering constraints from
"After="; merging the RELOAD to START does not add any additional
ordering that START would not already have had).

[1] So this difference only really matters when something blocks to wait
until the reload completes.




Re: [systemd-devel] Boot ordering

2015-03-20 Thread Uoti Urpala
On Fri, 2015-03-20 at 10:24 +0300, Andrei Borzenkov wrote:
> On Fri, Mar 20, 2015 at 1:56 AM, Kai Krakow  wrote:
> > The point is: Let's just find out why the "intuitive" way to solve the OP's
> > problem doesn't work out and find the right solution. Let's face it: Trying
> > to use targets as sysvinit runlevels equivalent is obviously not the working
> > way although it looks promising and intuitive at first
> 
> sysinit.target and basic.target are exact equivalent of sysvinit
> runlevels - they are hard serialization points between groups of
> services so that systemd will not proceed with next group until all
> services in previous group are started. The difference is that these

No, they are not hard serialization points. As I already mentioned in
another part of this thread, it's perfectly possible for a service
pulled in by multi-user.target to run before basic.target completes. The
only reason that's not very typical is that most services use
DefaultDependencies=yes unless they're specifically designed for early
boot. But if a service has been written with DefaultDependencies=no (for
example because it _could_ be a dependency of some other early-boot
service in certain specific configurations) then it's quite normal for
it to start before basic.target, even if the service is only part of
multi-user.target.




Re: [systemd-devel] Boot ordering

2015-03-19 Thread Uoti Urpala
On Thu, 2015-03-19 at 18:41 +0300, Andrei Borzenkov wrote:
> On Thu, Mar 19, 2015 at 6:11 PM, Michael Biebl  wrote:
> > The summary of my reply was "What you probably want, is hook into
> > basic.target or sysinit.target, use DefaultDependencies=no, and
> > specify the dependencies/orderings explicitly."
> >
> > Apparently, this didn't stick.
> >
> 
> The reality is, this question comes again and again; and the very fact
> that we had to add *-pre.target to allow such kind of ordering
> dependencies shows that the problem is real.

What exactly is your definition of "this question" and "the problem"? I
don't see any natural definition that would equally apply to what the
original poster was trying to do and things like local-fs-pre.target.




Re: [systemd-devel] Boot ordering

2015-03-19 Thread Uoti Urpala
On Thu, 2015-03-19 at 14:27 +0100, Christoph Pleger wrote:
> >> Then, I still do not understand why my definition of a new target did
> >> not
> >> work. What is the difference between multi-user.target waiting for
> >> basic.target on the one hand and new.target waiting for basic.target and
> >> multi-user.target waiting for new.target on the other hand, aside from
> >> that one intermediate step?

You're misunderstanding some of the basics of unit ordering. That
multi-user.target has an "After:" relationship to basic.target only
means that multi-user.target ITSELF will not be considered to have been
successfully started before basic.target has. This does not say anything
about the ordering of any other units, such as the services that are
started because multi-user.target wants them - the reason why some
service is started at boot (such as which target pulls it in via a
"Wants/Requires" relationship) says NOTHING about where the service can
be ordered. If multi-user.target wants some service, it's up to the
individual dependencies of that service to determine when the service
can be started.

Typically most services started by multi-user.target run after
basic.target, but that's only because they each have the default
configuration "DefaultDependencies=yes", which implies "After:
basic.target". If some service has "DefaultDependencies=no" and defines
no other ordering requirements, it can even be the first service to run
at boot even if it's only wanted by multi-user.target.
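A minimal Python sketch of this point, under simplifying assumptions: only ordering (After=) edges are modelled, Wants=/Requires= are intentionally left out because they do not affect ordering, and normal.service and early.service are hypothetical unit names.

```python
def must_wait_for(unit, after):
    """Return every unit that has to be started before `unit` may start."""
    seen = set()
    todo = list(after.get(unit, ()))
    while todo:
        dep = todo.pop()
        if dep not in seen:
            seen.add(dep)
            todo.extend(after.get(dep, ()))
    return seen


# unit -> units it is ordered after
after = {
    "basic.target": ["sysinit.target"],
    "multi-user.target": ["basic.target"],
    "normal.service": ["basic.target"],  # implied by DefaultDependencies=yes
    "early.service": [],                 # DefaultDependencies=no, no ordering
}

normal_waits = must_wait_for("normal.service", after)
early_waits = must_wait_for("early.service", after)
```

Both services may be wanted by multi-user.target, yet only normal.service has to wait for basic.target (and, transitively, sysinit.target). early.service has nothing to wait for and can be started right away.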


Thus your "between basic.target and multi-user.target" is not a
well-defined requirement. My best guess about what you might actually
want to achieve (assuming you aren't so thoroughly confused that it
makes no sense at all) is a service that runs before any service that
has DefaultDependencies enabled, and which requires (most of)
basic.target. I think this would be most practically implemented as a
"DefaultDependencies=no" service, which is wanted by basic.target, and
which has explicit dependencies on (most of) other services that are
wanted by basic.target.



Re: [systemd-devel] Deadlocks with reloading jobs which are part of current transaction

2015-03-10 Thread Uoti Urpala
On Wed, 2015-02-04 at 23:48 +0200, Uoti Urpala wrote:
> On Wed, 2015-02-04 at 21:57 +0100, Lennart Poettering wrote:
> > currently being started. You are suggesting that the reload can be
> > suppressed when a start is already enqueued, but that's really not the
> > case, because you first have to run m-c-c.s, before you can reload...

> If you mean literally running "systemctl restart

...

> So unless I completely misunderstood your example, it seems that this
> does NOT demonstrate any problems with removing the blocking.


Discussion seems to have died again. How to proceed with fixing this? Is
there anything more I can clarify about why the current behavior (that
is, when service startup is queued, have "reload" requests block until
the service is up) is wrong and why the fix would be valid?




Re: [systemd-devel] Deadlocks with reloading jobs which are part of current transaction [was: [PATCH] Avoid reloading services when shutting down]

2015-02-04 Thread Uoti Urpala
On Wed, 2015-02-04 at 21:57 +0100, Lennart Poettering wrote:
> OK, let's try this again, with an example:
> 
> a) you have one service mydaemon.service
> 
> b) you have a preparation service called
>mydaemon-convert-config.service that takes config from somewhere,
>converts it into a suitable format for mydaemon.service's binary
> 
> Now, you change that config that is located somewhere, issue a restart
> request for m-c-c.s, issue a reload request for mydaemon.service.
> 
> Now, something like this should always have the result that your
> config change is applied to mydaemon.service. Regardless if
> mydaemon.service's start was queued, is already started or is
> currently being started. You are suggesting that the reload can be
> suppressed when a start is already enqueued, but that's really not the
> case, because you first have to run m-c-c.s, before you can reload...

I do not see why that would cause any problems with removing the
blocking.

If you mean literally running "systemctl restart
mydaemon-convert-config.service; systemctl reload mydaemon.service" then
this should still work fine - the first "restart" will block until the
operation is complete and new config exists, and then the "reload"
guarantees that no old config is in use. However, I don't see why you'd
include the part about creating the new configuration via
mydaemon-convert-config.service in this case - the new configuration
already exists before any "reload" functionality is invoked, so why the
irrelevant complication of creating that configuration via another
service? It seems you are implicitly assuming some kind of parallel
execution of the restart and the reload?

If you mean something like "systemctl restart --no-block
mydaemon-convert-config.service; systemctl reload mydaemon.service", I
don't see why you'd ever /expect/ this to work with any reload semantics
- isn't this clear user error, and will be racy with current systemd
code just as much as the proposed fix? There are no ordering constraints
here, any more than there would be about starting two services and
expecting the first request to be started first. And in any case I'd
consider the semantics of reload to be "switch to configuration equal or
newer than what existed *when the reload was requested*", without any
guarantees that changes from operations queued but not finished before
calling reload would be taken into account.

So unless I completely misunderstood your example, it seems that this
does NOT demonstrate any problems with removing the blocking.




Re: [systemd-devel] [PATCH] sysv-generator: Skip init scripts for existing native services

2015-02-04 Thread Uoti Urpala
On Wed, 2015-02-04 at 22:02 +0100, Lennart Poettering wrote:
> On Wed, 04.02.15 21:26, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > Isn't this change also relevant to the creation of .wants symlinks, and
> > avoiding generating .wants links from the wrong targets?
> > 
> > As in, the case where you override a rcS.d sysvinit service with a
> > multi-user.target systemd unit (or other less common runlevel
> > combinations for distros that don't have any rcS.d level sysv any more).
> > You want to avoid generating a .wants symlink from an early boot target,
> > even if a generated unit file itself would be shadowed by the native
> > unit.
> 
> systemd does not support sysv scripts for early-boot targets
> anymore. This has been removed long ago.

Yes, but Debian patches rcS.d support back in because they still haven't
managed to create native units for every package. And as the comment in
parentheses says, the same issue still exists in principle on other
distros with other runlevels (though is less common and important than
on Debian).




Re: [systemd-devel] Deadlocks with reloading jobs which are part of current transaction [was: [PATCH] Avoid reloading services when shutting down]

2015-02-04 Thread Uoti Urpala
On Wed, 2015-02-04 at 19:36 +0100, Lennart Poettering wrote:
> On Wed, 04.02.15 20:19, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > You're missing an essential point here: there's a distinction between
> > skipping reloads for services which have not been dispatched, and
> > skipping reloads for services for which startup code is already running
> > (and may be using existing configuration) but which have not reached
> > full "running" status yet.
> > 
> > The former is the correct behavior (but currently handled wrong by
> > systemd!), and never causes races. Only the latter can cause races like
> > described above.
> 
> These two cases aren't that different. If somebody pushes an
> additional job into the queue that wants to run before the reload but
> after the service is up you cannot flush out the reload just
> because the service has not started yet...

I cannot parse what you're trying to say here, if it's anything
meaningful. Your "wants to run before the reload" sounds like you're
talking about guaranteeing that a reload NOT happen before something
else runs, but that would be nonsense - the "guarantee" would guarantee
nothing semantically relevant (if systemd only starts executing the
service binary *after* the reload has been queued, it cannot use any
pre-reload-order config at any point; there's no "guaranteed to use old
config" guarantee of any form possible!).


> Whether you change config in your current context, or you do so from a
> new unit's context is no difference: we cannot move anything that is
> supposed to happen after that change before it, and we cannot remove it
> either...

If no code from a service is currently running, it's already guaranteed
that every request issued to the service in the future will use the new
config (no old state exists, and any newly started process will
obviously load the new config). Thus the requirements for a reload are
already fulfilled; the operation is complete, and there is nothing more
to do. Unnecessary waiting only causes deadlocks for no benefit
whatsoever.

> There are some forms of coalescing possible, but we already do all of
> the ones that are safe...

This is not exactly "coalescing" - it's just immediately returning
success if there is no service code running (either in "running" state
or in startup state where a process already exists and could have read
the old config before it was changed).

Removing the current incorrect blocking and returning success
immediately is 100% safe, in the following strictly defined sense:
All requests handled by the service after "systemctl reload" has
returned will use a version of config equal or newer than the one that
was in effect when the reload call was started.

If you still want to claim that removing the blocking would not be safe,
please try to construct a sequence of operations where such non-blocking
behavior would lead to failure (failure defined as: the service
processes a request using configuration older than what existed when
"reload" was requested). I'm confident that it is impossible to
construct such a counterexample.
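
The guarantee stated above can be checked against a toy model. This is a
sketch only: the Service class, the function names and the integer config
versions are all made up for illustration, and real systemd job handling
is far more involved (in particular, a reload issued while startup code
is running would still have to wait for that startup to finish).

```python
class Service:
    """Toy service: tracks which config version running code has read."""

    def __init__(self):
        self.loaded = None  # None means no service code is running yet

    def dispatch_start(self, disk_version):
        # A freshly started process reads whatever is on disk right now.
        self.loaded = disk_version


def reload(service, disk_version):
    """Proposed semantics: nothing to do if no service code is running."""
    if service.loaded is None:
        return "nothing to do"
    service.loaded = disk_version
    return "reloaded"


def scenario(start_dispatched_before_change):
    """Return (what reload did, whether the guarantee held)."""
    disk = 1
    svc = Service()                # start is queued, nothing runs yet
    if start_dispatched_before_change:
        svc.dispatch_start(disk)   # startup code may have read version 1
    disk = 2                       # admin edits the config
    version_at_reload = disk
    outcome = reload(svc, disk)    # returns without waiting for a start
    if svc.loaded is None:
        svc.dispatch_start(disk)   # the queued start is dispatched later
    return outcome, svc.loaded >= version_at_reload


print(scenario(start_dispatched_before_change=False))
print(scenario(start_dispatched_before_change=True))
```

In both scenarios the service ends up with a config at least as new as
the one on disk when reload was requested; only the second one needs an
actual reload.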




Re: [systemd-devel] [PATCH] sysv-generator: Skip init scripts for existing native services

2015-02-04 Thread Uoti Urpala
On Wed, 2015-02-04 at 15:06 +0100, Martin Pitt wrote:
> Lennart Poettering [2015-02-04 13:42 +0100]:
> > Well, but their enablement status so far is not ignored. i.e. if you
> > drop in a unit file, as well as a sysv script, and the latter is
> > enabled, but the former not, then systemd currently reads that so that
> > the sysv one is overridden by the native one, and the native one is
> > considered enabled.
> > 
> > With this change you alter that behaviour. Is that really desired?

> So in that regard it would be an intended behaviour change indeed.
> But either way this is a corner case for sure. I just wouldn't like to
> carry this patch forever as it's relatively unimportant.
> 
> Maybe Jon can chime in about his intentions with this?

Isn't this change also relevant to the creation of .wants symlinks, and
avoiding generating .wants links from the wrong targets?

As in, the case where you override a rcS.d sysvinit service with a
multi-user.target systemd unit (or other less common runlevel
combinations for distros that don't have any rcS.d level sysv any more).
You want to avoid generating a .wants symlink from an early boot target,
even if a generated unit file itself would be shadowed by the native
unit.




Re: [systemd-devel] Deadlocks with reloading jobs which are part of current transaction [was: [PATCH] Avoid reloading services when shutting down]

2015-02-04 Thread Uoti Urpala
On Wed, 2015-02-04 at 16:38 +0100, Lennart Poettering wrote:
> On Wed, 04.02.15 15:25, Martin Pitt (martin.p...@ubuntu.com) wrote:
> > Lennart Poettering [2015-02-04 13:27 +0100]:
> > > On Wed, 04.02.15 08:56, Martin Pitt (martin.p...@ubuntu.com) wrote:
> > > >  - Don't enqueue a reload if the service to be reloaded isn't running.
> > > >E. g. postfix.service "inactive/dead" in
> > > >https://bugs.debian.org/635777 or smbd.service "start/waiting" in
> > > >https://launchpad.net/bugs/1417010.  This would completely avoid
> > > >the deadlock in most situations already, and doesn't change the
> > > >semantics for working use cases either, so this should even be
> > > >applicable for upstream?
> > > 
> > > No, this would open up the door for races. The guarantee we give
> > > around blocking operations, is that by the time they return the
> > > operation or an equivalent has been executed. More specifically, if
> > > you start a service, and it is in "starting", and then issue a
> > > "reload" or "restart", and it returns you *know* that the
> > > configuration that was on disk at the time you issued the "reload" or
> > > "restart" -- or a newer one -- is in place. If you'd suppress the
> > > reload/restart in this case, then you will not get that guarantee,
> > > because the configuration ultimately loaded might be the one from the
> > > time the daemon was first put into starting mode at.

You're missing an essential point here: there's a distinction between
skipping reloads for services which have not been dispatched, and
skipping reloads for services for which startup code is already running
(and may be using existing configuration) but which have not reached
full "running" status yet.

The former is the correct behavior (but currently handled wrong by
systemd!), and never causes races. Only the latter can cause races like
described above.

Fixing the systemd semantics should fix most of the bootup deadlock
cases. This is not a "sysv workaround" or anything like that. The
current systemd semantics are wrong and undesirable for new code,
regardless of any legacy compatibility issues. Fixing them would give
semantics that are more logically correct and work better in practice.

IIRC the smbd.service case was just a buggy circular service definition
and cannot be reasonably fixed by any systemd-side semantics change; if
I remember that correctly, it should not be used as an example in any
discussion of systemd-side changes.


> > Hm, I don't quite understand this. If you reload a service which isn't
> > currently running, systemctl will fail. It's just currently going to
> > enqueue the reload request, and executing the job will fail due to the
> > "not active" check in unit_reload(). The problem with that is just
> > that systemctl doesn't fail immediately with "the unit is not active",
> > but blocks on the queued request. So if you don't have pending jobs,
> > the net result is the same, but with pending jobs you can easily get a
> > deadlock.
> 
> Again, if a "start" job is already queued and in progress of being
> dispatched, we need to queue the reload job, to get the right
> guarantees.
> 
> It's not that hard to see, is it?

You're correct about the "in progress of being dispatched" case, but the
problem case is when systemd incorrectly blocks when no client side code
using old configuration has been actually dispatched yet (only a queued
start inside systemd). The latter systemd misbehavior causes the
deadlocks. The "in progress of being dispatched" case typically does not
cause deadlocks, because the already running startup will normally
finish without blocking on anything, and then the new reload can run, so
there's no lock.


> We must execute the jobs in order. If there's start job in progress,
> or a reload job, and we queue a reload job, then we need to wait for
> the start job or reload job to finish, to begin with the new reload
> job. Otherwise you cannot give the guarantee I pointed out above.

Again, that's literally correct but talking about the wrong case. The
relevant case is the one where there ISN'T a "job in progress", only a
queued one.


This was discussed before, last mail being:
http://lists.freedesktop.org/archives/systemd-devel/2014-October/024612.html
(no replies to that one).




Re: [systemd-devel] [PATCH] util: fix strict aliasing violations in use of struct inotify_event

2014-12-22 Thread Uoti Urpala
On Mon, 2014-12-22 at 00:17 -0800, Shawn Paul Landden wrote:
> There is a lot of cleanup that will have to happen to turn on
> -fstrict-aliasing, but I think our code should be "correct" to the rule.

> -        uint8_t buffer[INOTIFY_EVENT_MAX] _alignas_(struct inotify_event);
> +        union {
> +                struct inotify_event ev;
> +                uint8_t raw[INOTIFY_EVENT_MAX];
> +        } buffer;

I don't think the union is really necessary for correctness here. Access
through a character type may legally alias access by any other type.
Strictly speaking, the "character types" are "char", "signed char" and
"unsigned char", and uint8_t is not required to be one of them, but I
don't think this is an issue in practice. Thus the existing code should
be strict-aliasing safe unless I'm missing something.




Re: [systemd-devel] /usr vs /etc for default distro units enablement

2014-12-05 Thread Uoti Urpala
On Fri, 2014-12-05 at 02:39 +0100, Lennart Poettering wrote:
> On Tue, 02.12.14 20:02, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > On Tue, 2014-12-02 at 01:51 +0100, Lennart Poettering wrote:
> > > On Tue, 18.11.14 16:09, Michael Biebl (mbi...@gmail.com) wrote:
> > > > WantedBy=multi-user.target
> > > > 
> > > > and version B has
> > > > [Install]
> > > > WantedBy=foo.target
> > > 
> > > Package installs should probably not try to do something about this
> > > case, just leave things as is.
> > 
> > I think this would be a very bad idea. Installing a system and then
> > upgrading it should give you the same state as installing a newer system
> > directly; silently leaving outdated configuration in use almost ensures
> > that systems will fail/degrade over time.
> 
> Well, things are not that easy.
> 
> If distro Foobarix decides one day that from this day on sendmail
> should be enabled again by default, what is the "right" upgrade path
> for old installs? Should it be enabled now, because that's the new
> default for new installs? Or should it stay disabled, since that's
> what the user is accustomed to?

The context here was a package changing its install target, not changing
default enable/disable behavior as in your example. The latter is a less
clear-cut case: if the unit has a [Install] section, then presumably the
packager considers both enabled and disabled state supported at least to
some degree, and thus both are explicitly valid choices even on newly
installed systems. By contrast, leaving symlinks from targets that do
not match the [Install] section of the current service file anymore is
more arbitrary reconfiguration, which cannot be expected to work in
general (as in linking arbitrary units from arbitrary targets is not
expected to work), and it's the admin's responsibility to investigate
what he needs to do to make such configurations work and keep them
working.

Keeping such obsolete configuration would mean that systems rot over
time. Package configuration files are not handled that way, and package
startup configuration shouldn't be either, for the same reasons.

Just leaving the symlinks would not give good behavior even in the case
where the admin wants to keep the old target: temporary disable + then
re-enable would now change the target. Perhaps the recommended way to
change targets in local configuration should be to override the
[Install] section, instead of just leaving different symlinks?


> units towards statically enabled ones anyway. But again: if something
> was optional before, and is optional after, then be conservative,
> don't change things...

IMO in the changing-targets case it's not "conservative" at all to
silently leave the system in a state where it has obsolete configuration
which does not match anything supported by the current packaging. Such
behavior is almost 100% guaranteed to break things at some point.
"Conservative" would be something like refusing to upgrade until the
admin resolves possible conflicts manually. If no local configuration
can be detected, using the new packaging defaults has a better chance of
avoiding breakage than leaving obsolete configuration.


> Sure, if you know that changes in your unit files actively break a
> previous default, then you should do something about it, but I think
> cases like this are best handled with careful, manually written
> postinst scripts, that do the right thing in the most defensive
> possible way. But blindly enabling all kinds of stuff just because the
> upstream default changed is really not a good idea I think.

The new defaults could enable more things, or they could disable parts
that are now deemed insecure or unnecessary, or just generally fix bugs.
The sane default assumption is that later package versions are better
than earlier ones, and leaving the system using old default
configuration is worse than new configuration. And that's assuming the
old configuration even works anymore given changes elsewhere.




Re: [systemd-devel] /usr vs /etc for default distro units enablement

2014-12-02 Thread Uoti Urpala
On Tue, 2014-12-02 at 01:51 +0100, Lennart Poettering wrote:
> On Tue, 18.11.14 16:09, Michael Biebl (mbi...@gmail.com) wrote:
> 
> > 2014-11-18 15:59 GMT+01:00 Colin Guthrie :
> > > For the avoidance of doubt, I believe that running systemctl preset
> > > should only ever happen on *first* install, never on upgrade or such like.
> > >
> > 
> > And what are you going to do, if the unit file changes?
> > Say v1  had
> > 
> > [Install]
> > WantedBy=multi-user.target
> > 
> > and version B has
> > [Install]
> > WantedBy=foo.target
> 
> Package installs should probably not try to do something about this
> case, just leave things as is.

I think this would be a very bad idea. Installing a system and then
upgrading it should give you the same state as installing a newer system
directly; silently leaving outdated configuration in use almost ensures
that systems will fail/degrade over time.

> I mean, let's not forget that admins should be able to define their
> own targets and then enable units in them and disable them
> elsewhere. Package upgrades should not manipulate that. The first
> installation of a package should enable a unit if that's what the
> preset policy says, but from that point on the configuration is admin
> configuration and should not be changed anymore automatically.

It's theoretically possible that the admin could have moved it to a
different target, but that's the exception. The system should still
sanely handle the normal case where the admin has changed nothing, and
in that case leaving the unit in its original target is completely
wrong.

For example, if you move a unit from sysinit.target to multi-user.target
and remove "DefaultDependencies=no" from the .service file, then leaving
the installed symlink for sysinit.target will cause a dependency loop.
Even in cases where the resulting configuration is not so obviously
broken, differences from the behavior of new installs are always
harmful.
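
The loop can be shown with a small Python sketch. The assumptions are
mine: "foo.service" is a made-up unit, only After= ordering edges are
modeled, a service with default dependencies is taken to be ordered after
sysinit.target and basic.target, and a target is taken to be ordered
after the units it wants.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def ordering_cycle(after):
    """Return one ordering cycle as a list of units, or None."""
    try:
        list(TopologicalSorter(after).static_order())
    except CycleError as err:
        return err.args[1]
    return None


# foo.service now has DefaultDependencies=yes, so it is ordered after
# sysinit.target and basic.target.
FOO_AFTER = {"sysinit.target", "basic.target"}

stale_symlink = {
    "foo.service": FOO_AFTER,
    "basic.target": {"sysinit.target"},
    # Leftover sysinit.target.wants/ symlink: the target waits for foo.
    "sysinit.target": {"foo.service"},
}

updated_symlink = {
    "foo.service": FOO_AFTER,
    "basic.target": {"sysinit.target"},
    "sysinit.target": set(),
    # The symlink now lives in multi-user.target.wants/ instead.
    "multi-user.target": {"basic.target", "foo.service"},
}

print(ordering_cycle(stale_symlink))
print(ordering_cycle(updated_symlink))
```

With the stale symlink the sorter reports a cycle that includes both
foo.service and sysinit.target; with the symlink moved to the new target
there is none.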

If the admin HAS changed the configuration, then silently ignoring
package changes is still not good behavior: it's likely that the admin
should at least check whether the local changes are still required/valid
and fix the setup to match the new package behavior if needed.

So overall, I think the rules for package upgrades should be:
In the default no-local-changes case, package upgrades MUST change
symlinks to match what a new package install would set.
If local configuration changes can be detected, then the admin should be
alerted to the need to check the configuration if possible.
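
As a rough sketch of these two rules in Python (the function name and
the idea of comparing sets of target names are my own simplification, not
anything an existing packaging tool does):

```python
def symlinks_after_upgrade(old_default, new_default, installed):
    """Decide which targets should link to the unit after an upgrade.

    old_default: WantedBy= targets of the previously packaged unit file
    new_default: WantedBy= targets of the newly packaged unit file
    installed:   targets that currently have a .wants symlink
    Returns (targets to link from, whether the admin should be alerted).
    """
    old_default = frozenset(old_default)
    new_default = frozenset(new_default)
    installed = frozenset(installed)

    if installed == old_default:
        # No local changes detected: match what a fresh install would do.
        return new_default, False

    # Local changes detected: keep them, but ask the admin to check them
    # unless they already match the new packaging defaults.
    return installed, installed != new_default


print(symlinks_after_upgrade(
    {"multi-user.target"}, {"foo.target"}, {"multi-user.target"}))
print(symlinks_after_upgrade(
    {"multi-user.target"}, {"foo.target"}, {"custom.target"}))
```

Note that this sketch treats a disabled unit (no symlinks at all) as a
local change as well, which may or may not be what a distro wants.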




Re: [systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

2014-10-27 Thread Uoti Urpala
[Resending to the list, as it seems recipients were wrong in the first attempt]

The discussion on this died down. I'm bringing this back up as it's IMO
quite a significant problem.

To recap:

The core issue is that if a start job is queued for foo.service,
"systemctl reload foo.service" blocks until the service is started and
then ready. This is wrong: the blocking behavior is unlikely to be
useful for any real use case, but it does cause real deadlock issues and
distros already have to work around it.


The main argument in favor of this misbehavior was that if you issue a
reload, you do it with the intent to use the service, and thus it would
be positive to ensure it is usable after such a command. I think that in
practice this is not true: neither would it be a good idea to write code
that relies on such blocking, nor are people likely to do things that
way in practice (good idea or not). As in, people shouldn't, and likely
won't, write code with semantics like the following:

    systemctl start --no-block foo.service
    systemctl reload foo.service # let's do the blocking here!
    # here we can rely on the service being up

In sane code, if you don't want to change the operation of the service,
you should be able to skip the reload call and things shouldn't break.
Any such sane code does not benefit from the extra blocking.


The blocking is actively harmful because it can cause deadlocks. One
case where this is especially likely is during boot in hook code that
changes the configuration of some service. The hook does not know
whether other components intend to use the service afterward or not.
Thus it should generally ensure that the reload is complete before
returning, and not use no-block. But if the hook is called early in the
boot and blocks, this can prevent the later service that reload is
called on from ever actually starting.
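
The boot-time deadlock can be illustrated with a toy wait-for model in
Python. Everything here is hypothetical: "hook.service" stands for the
early-boot hook, foo.service is assumed to be ordered after it, and the
three "jobs" are just labels, not real systemd job objects.

```python
def stuck_jobs(reload_blocks_on_queued_start):
    """Return the jobs that can never finish, sorted by name."""
    waits_for = {
        # The hook cannot finish until its "systemctl reload" returns.
        "hook.service/start": {"reload foo.service"},
        # foo.service is ordered after the hook.
        "foo.service/start": {"hook.service/start"},
        # Current behavior: the reload waits for the queued start.
        # Proposed behavior: it returns at once, nothing is running yet.
        "reload foo.service": (
            {"foo.service/start"} if reload_blocks_on_queued_start else set()
        ),
    }
    finished = set()
    progress = True
    while progress:
        progress = False
        for job, deps in waits_for.items():
            if job not in finished and deps <= finished:
                finished.add(job)
                progress = True
    return sorted(set(waits_for) - finished)


print(stuck_jobs(reload_blocks_on_queued_start=True))
print(stuck_jobs(reload_blocks_on_queued_start=False))
```

With the blocking reload all three jobs wait on each other and none can
finish; without it the reload completes immediately, then the hook, then
foo.service.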


IMO the correct way to view this issue is that "configuration of service
X is guaranteed to say Y" and "service X is up" are orthogonal states.
There are several situations where it makes sense to write code that
deals with the first state only; mixing in waits for the other state to
reach a particular value only delays things unnecessarily at best or
causes deadlocks at worst.





Re: [systemd-devel] I wonder… why systemd provokes this amount of polarity and resistance

2014-10-21 Thread Uoti Urpala
On Wed, 2014-10-22 at 02:13 +0200, Martin Steigerwald wrote:
> With that I perceive starts an answer on a technical matter ends with what I 
> received as a dire personal attack: I.e. calling me names.

I think it was a mostly justified criticism of your posting style here.

> I will make an effort to reply to your mail and then most likely unsubscribe, 
> cause for me I feel like being in an hostile environment.

If you post such strongly worded criticism of people's work (which I
don't consider really justified criticism either) then IMO you should
tolerate that level of negative response without starting to complain
about "hostile environment".


> Upstream systemd has a very high development speed. Which you may view as 
> good. And heck, yes, it has its advantages I agree. But to me it also seems 
> that this speed partly come due to what you wrote above as the easy way of 
> developing things. And that easy way to develop things, I argue now, makes it 
> more difficult for people wanting to port to different platforms, people only 
> wanting a subset of systemd and people who want to adapt systemd.

Those latter are much smaller groups than the number of people who just
need a well-working init system for their Linux machine. It wouldn't
make sense to sacrifice the functionality of init just to make porting
easier.


> > > systemd provides more and
> > > more functionality that desktops like to use, that other tools like to
> > > use.
> > > 
> > > When Microsoft back then did something like this it was called "Embrace,
> > > Extend and Extinguish"¹…

> > > Really… it matches quite closely.
> > 
> > Oh come on! This is just an attack and FUD. You make repeated claims of
> > coming in good faith etc. and seem to dismiss any technical defence
> > being made with vague references and then you bring out a aggressive and
> > argumentative statement like the above.
> 
> That is the impression you get.
> 
> I think I replied to technical arguments as well.

The above does not match the definition of "Embrace, extend and
extinguish" (see for example the Wikipedia definition at
http://en.wikipedia.org/wiki/Embrace,_extend_and_extinguish
). It's a lot more specific than just "a product manager to push
competing ones out of the market", and pretty much requires intentional
malice to apply. IMO it was quite accurate to call your claim
attack/FUD.


> What I tried to achieve is to describe and interpret what I see regarding the 
> state of systemd as I understand it now, and granted my understanding may not 
> be complete, sure, and also describe and interpret behavior I have seen. And 
> also summarize some of this from the feedback I read elsewhere.
> 
> What I didn't try to achieve was attacking persons.
> 
> Yet, I interpret your reaction to me as if I attacked you.
> 
> So I am obviously not producing the outcome I wanted to produce. And that's
> one reason why I think I will stop doing what I am doing after this mail and
> unsubscribe from this list for a while.
> 
> Actually I think I made my point. I tried to channel some of the dire
> concerns and uproar and polarity and split tendencies upstream.
> 
> I see this happening to my beloved distribution Debian and I am not happy 
> about it. The systemd debates and polarity within Debian I consider as being 
> harmful.
> 
> And it was my intention to address some of this upstream in order to discuss 
> what can be done to first *understand* why it triggers this polarity and what 
> can be done to address this.

Maybe your *intention* was to address reasons for controversy in a
constructive manner, but I do not think you succeeded very well. Several
of your points had already been made by others before - many, many
times. You bring up little that systemd authors would not have already
addressed before. Things like your "Embrace, extend and extinguish"
comparison above are attacks with little constructive content. And when
presented with technical justification to develop certain things in the
same project, or keep certain functionality in PID 1, you seem to
largely ignore it. Yes, it is a tradeoff, and you can always find some
negative side. But you won't achieve anything by ignoring the answers
and talking about the negative sides, if you can't make a better
argument why the tradeoff would be wrong overall.


> > Of course this criticism is listened to and often actions are taken
> > because of it, but what do you expect the outcome to be? Do you expect
> > all the repos to be split up? Stable APIs exported and supported? What
> > outcome do you actually *want* here? You seem to be doing lots of
> > complaining, but very little in the actual suggesting what could be done
> > different that has not already been addressed technically. You may
> > disagree about that end decision but that's just the way it is
> > sometimes? The people doing the work ultimately get to make the decisions.
> 
> Yeah, thats the do-ocracy aspect of things. Still if what I do aga

Re: [systemd-devel] I wonder… why systemd provokes this amount of polarity and resistance

2014-10-07 Thread Uoti Urpala
On Tue, 2014-10-07 at 14:15 -0400, Rob Owens wrote:
> My question really isn't "why are the Debian dependencies the way they are".  
> I understand that.  I was trying to highlight the strange situation of a 
> desktop application requiring a particular init system.  I *think* this is a 
> result of the init portion of systemd being bundled together with the logind 
> portion of systemd.  To me (unfamiliar with the systemd code) those two 
> functions seem distinct enough to merit being separate.  Debian can't easily 
> separate what the systemd devs have developed as a single binary, so we end 
> up with these strange dependency chains.

"Single binary" is false of course. Logind is developed as a separate
program, which is why systemd-shim is possible at all.

AFAIK the actual relevant dependencies go as follows: First, there's a
good reason why logind requires cgroup functionality. And there's a good
reason why cgroup functionality is best implemented together with init
(see
http://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
for more info). So it's not quite directly "logind has to depend on
systemd as init", but "logind has to depend on the system having cgroup
support, and there's no equally good cgroup support available for inits
other than systemd". It is possible to provide the relevant cgroup
interfaces in or on top of another init, as the systemd-shim + cgmanager
combination attempts to do. But it is not trivial to do, as bugs and
implementation delays with that combo have shown, and it's quite likely
that there will be more problems in the future. It's not a particularly
good idea to use the less-tested and less-reliable systemd-shim instead
of the more reliable systemd. Thus the overall result is that yes, it
does make sense to switch machines to systemd when you add certain
functionality, even if that functionality does not appear to be directly
tied to the init system at first glance.


> I never thought much about my init system until recently.  I never really had 
> any complaints with SysV init, although I do recognize that systemd provides 
> real improvements for some use cases.  So for me as a sysadmin the wisest 
> thing to do is stick with what I already know, as long as it's working well 
> enough (and for me, it is).

The issue with "I should be able to stay with sysvinit because it worked
fine for me" is that keeping sysvinit working is COSTLY. The reason
sysvinit used to mostly work was not that it would have been a reliable
system - it mostly worked because people kept spending a lot of effort to
work around and paper over various issues that kept popping up. And
there's no justification to keep up that effort for the minority who
wants to stay with sysvinit. So, you can keep running your old systems
unchanged if you want, but you shouldn't expect to be able to upgrade
them or install new software without switching to systemd.


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] variable expansion in ExecStart

2014-10-04 Thread Uoti Urpala
On Sat, 2014-10-04 at 21:24 +0200, Zbigniew Jędrzejewski-Szmek wrote:
> Environment="X='Y' Z"
> ExecStart=/bin/echo $X ${X}
> 
> results in echo[31266]: Y Z 'Y' Z
> 
> i.e., $X not only splits at whitespace, as documented, but also strips quotes.
> Is this by design, or is it an implementation accident? Should this behaviour
> be changed?

Isn't the current behavior with quotes the only way to pass an arbitrary
number of arguments that possibly contain whitespace? If you want $FOO
to expand to the argument list ["a", "b c"] you can currently express
that as "a 'b c'". If you remove unquoting, there is no alternative
syntax, is there?
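
To make the point concrete, here is a minimal sketch. It assumes Python's
shlex (POSIX shell quoting rules) as a stand-in for systemd's own splitting
code, which may differ in details; both function names are made up for
illustration.

```python
import shlex

def expand_unquoted(value):
    """Split on whitespace and strip quotes, roughly like $FOO in ExecStart.

    Assumption: shlex's POSIX rules approximate systemd's unquoting.
    """
    return shlex.split(value)

def expand_whitespace_only(value):
    """Split on whitespace only, keeping quote characters as literal text."""
    return value.split()

# With unquoting, "a 'b c'" can carry an argument containing whitespace.
print(expand_unquoted("a 'b c'"))
# Without unquoting, the same value falls apart into three words.
print(expand_whitespace_only("a 'b c'"))
```

With whitespace-only splitting there is no value of FOO that yields the
two-element list, which is the gap described above.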




Re: [systemd-devel] I wonder… why systemd provokes this amount of polarity and resistance

2014-09-21 Thread Uoti Urpala
On Sun, 2014-09-21 at 15:31 +0200, Martin Steigerwald wrote:
> Did you ever ask yourself why your project provokes that amount of resistance
> and polarity? Did you ever ask yourself whether this really is just resistance
> against anything new from people who just do not like "new" or whether it
> contains *valuable* and *important* feedback?

I think that in general the existence of significant amounts of
resistance is explained by opposition to anything new. Systemd changes
many things, and as distros won't keep support for sysvinit at the same
time, people can't easily keep using the old stuff they're used to.
That's enough to explain complaints, and their existence does not by
itself mean there would be anything wrong with systemd.


> For now I use systemd. I like quite some features. But on the other hand I am 
> wary about it myself. I look at a 45 KiB binary for /sbin/init as PID1 and a 
> 1,3 MiB binary in systemd 215 and wonder myself.

Sysvinit as PID 1 lacks many essential things, so that is not a valid
size comparison (and just having the code running with a PID other than
1 is not an improvement).

In general, any complaints about the size/"bloat" of PID 1 are rather
silly if you still use it with the Linux kernel, which contains a lot
more code in a more central role than PID 1.


>  I had it that it didn't mount an NFS export and while in the
> end it was a syntax error in fstab that sysvinit happily ignored, I needed a
> bug report and dev help to even find that cause. I wonder about the complexity
> involved in one single large binary.

I think this works as an example of how change leads to people
complaining, completely unrelated to the existence of any actual quality
issues. Sysvinit behavior or debuggability wrt mount issues was not
better than systemd is, much less by so much that this would illustrate
any general issue with systemd (that's not to say that systemd
diagnostics could not be improved). Yet because you were first familiar
with sysvinit and had created dubious configuration which happened to
work with it, you now feel this is a problem in systemd, just because
things have changed. Someone who started with systemd and used it for
years before encountering sysvinit would hit a lot more problems.


> Well… it's not about my thoughts about systemd, it is about my perception that
> I have never seen any free software upstream project creating this amount of
> polarity and discussion in a decade or more.

I don't think the reactions to systemd are in any way unique. I've seen
similar reactions to other changes. The difference in the systemd case
is that a lot of developers interact with the init system at least on a
superficial level, and init system choice is mostly done on the distro
level, so people can't easily ignore systemd and keep using their old
software. That increases the volume of the complaints.


> Is it really all just nay-sayers for the sake of nay-saying? Or do they – at 
> least partly – provide *valuable* and *important* feedback.

"nay-sayers" as in people who oppose the adoption of systemd because
they think that some alternative is less flawed tend to have no clue.
You're more likely to get *valid* criticism from people who are at least
competent enough to recognize that whatever problems systemd has, the
alternatives are worse.




Re: [systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

2014-08-15 Thread Uoti Urpala
On Fri, 2014-08-15 at 22:22 +0400, Andrei Borzenkov wrote:
> On Fri, 15 Aug 2014 20:25:57 +0300
> Uoti Urpala wrote:
> > The problem with this is that it's common for things updating
> > configuration to be separate from things using the daemon. If something
> > changes, the configuration update part wants to guarantee that
> > subsequent requests, *if any*, use the new configuration, but does not
> > itself make any such requests; as such, blocking for the service to be
> > up only causes unnecessary delays and sometimes deadlocks. Ensuring that
> > the service is up belongs to different code paths that actually make
> > requests to the daemon. And they do that whether there's been a reload
> > or not, so they need to handle it regardless of reload behavior anyway.
> > 
> 
> It's not how I interpret "reload" and how "reload" was traditionally
> implemented by initscripts. "reload" means - request daemon to do
> whatever is necessary to start using new configuration. It never
> implied changing this configuration. This happens outside of scope of
> performing "reload" action. You seem to interpret "reload" as request to
> update static on-disk configuration of service. Am I right?

No, I didn't say anything about "systemctl reload" itself modifying
on-disk configuration (if that's really what your "request to update
static on-disk configuration" meant).

The basic difference in desired semantics seems to be:

me: "reload" should ensure that system has switched to the new
configuration. No other semantics, just that any configuration that is
used after "reload" has returned is the new one.

Lennart: "reload" should ensure the system has switched to the new
configuration, *and* should also wait to ensure that the daemon is up
and is currently responding to requests with the new configuration if
possible.


The latter semantics cause problems for any generic state change code
which writes new configuration for a service, runs "systemctl reload",
and then informs the caller that state was successfully changed.
Changing configuration does not imply that you want the daemon to be
ready to handle requests!

In case this is still not clear, consider this division of code:

1) Event hook which runs in response to some external changes or admin
requests. Writes new configuration for the daemon, and then runs
"systemctl reload foo.service". Does not use --no-block, because it
should be guaranteed that the new configuration is in effect before the
hook returns. Does not itself make any requests to the daemon.

2) Code elsewhere that actually makes requests to the daemon.

Code 1 can run early during the boot before the service itself starts.
If "reload" blocks until the queued start of the daemon is executed,
this causes a deadlock: the hook waits for the daemon to start, but boot
can not progress to the point where the daemon starts because the part
running the hook is blocked in systemctl.

Having reload block until a starting service is really up does not
have any positive effect: code 2 has to depend on other ways to ensure that
the daemon is up before making requests anyway, because it can not
assume that the reload hook has necessarily been triggered at any prior
point (and even if it could make such an assumption, relying on that
would seem like quite a hacky design - there are much better ways to
ensure daemons you require are up).
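
The deadlock can be shown with a toy wait-for graph. This is only a sketch:
the unit names (hook.service, foo.service) and the ordering between them are
hypothetical, and the function is not anything systemd itself contains.

```python
def find_wait_cycle(waits_for, start):
    """Follow 'X waits for Y' edges from start; return the cycle if one exists."""
    path = []
    node = start
    while node is not None:
        if node in path:
            return path[path.index(node):]
        path.append(node)
        node = waits_for.get(node)
    return None

# Blocking reload: the hook waits for systemctl, systemctl waits for the
# queued start of foo, and foo (hypothetically ordered after the hook's
# unit) waits for the hook to finish.
blocking = {
    "hook.service start": "systemctl reload foo",
    "systemctl reload foo": "foo.service start",
    "foo.service start": "hook.service start",
}

# Reload that returns at once for a not-yet-started unit: systemctl waits
# for nothing, so the chain ends there.
nonblocking = {
    "hook.service start": "systemctl reload foo",
    "foo.service start": "hook.service start",
}

print(find_wait_cycle(blocking, "hook.service start"))
print(find_wait_cycle(nonblocking, "hook.service start"))
```

In the second graph the hook finishes, boot proceeds, and foo starts with the
new configuration already on disk.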




Re: [systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

2014-08-15 Thread Uoti Urpala
On Fri, 2014-08-15 at 12:50 +0200, Lennart Poettering wrote:
> On Fri, 15.08.14 05:09, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> 
> > > > Before this commit systemctl reload on an inactive unit with a queued
> > > > start job would block until the unit had started if the unit supported
> > > > reload, but return failure immediately if the unit didn't.
> > > 
> > > This sounds like correct behaviour to me.
> > > 
> > > 1) I think the rule should be to return only after the unit is in the
> > > desired state, and the desired operation or something equivalent has
> > > been executed. This is useful, so that callers can rely on the fact that
> > 
> > What is your "desired state" for reload then?  
> 
> *operating* with the new configuration loaded.

The problem with this is that it's common for things updating
configuration to be separate from things using the daemon. If something
changes, the configuration update part wants to guarantee that
subsequent requests, *if any*, use the new configuration, but does not
itself make any such requests; as such, blocking for the service to be
up only causes unnecessary delays and sometimes deadlocks. Ensuring that
the service is up belongs to different code paths that actually make
requests to the daemon. And they do that whether there's been a reload
or not, so they need to handle it regardless of reload behavior anyway.



> > > > Additionally systemctl reload-or-try-restart (and systemctl force-reload)
> > > > would block until the unit had started if the unit supported reload, but
> > > > return *success* immediately if the unit didn't.
> > > 
> > > This actually also sounds like correct behaviour to me.
> > > 
> > > "try-restart" means that we try to restart a service, but only if it is
> > > already running. If we are not running this is not considered an
> > > error. If you then add the "reload-or-" to the mix, this means that we
> > > will try to reload it if the unit supports it, instead of restarting it.
> > > 
> > > So, if a start is already queued, we'll wait for it to finish, so that
> > > the configuration is fully in effect afterwards. But if the thing is not
> > > running at all, then we will not consider that a problem at all and
> > > return success.
> > 
> > I don't think this makes sense. Reload is a "softer" alternative; if I
> > say "reload-or-try-restart", that should be considered as asking less
> > from systemd - full restart is not necessary, just reload is enough if
> > that option is available. Asking for less should not be a reason to
> > block longer!
> 
> "try-restart" also waits for any already queued start job to finish, if
> there's any, before it returns, hence "reload-or-try-restart" and
> "try-restart" actually block for the same time... Not following here...

"try-restart" does *not* wait for a queued start job to finish.
job_type_collapse() changes JOB_TRY_RESTART to JOB_NOP if
UNIT_IS_INACTIVE_OR_DEACTIVATING(), which is true if the job is just
queued and not yet being executed.
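
As a simplified model of that collapsing step (a sketch, not the actual C
code: the state names, the exact contents of the "inactive or deactivating"
set, and the reload_patch switch are my assumptions for illustration):

```python
# Assumed stand-in for UNIT_IS_INACTIVE_OR_DEACTIVATING().
INACTIVE_OR_DEACTIVATING = {"inactive", "failed", "deactivating"}

def collapse(job_type, unit_state, reload_patch=False):
    """Toy model of collapsing a requested job type against the unit state.

    A unit whose start job is only queued is still "inactive" here.
    """
    inactive = unit_state in INACTIVE_OR_DEACTIVATING
    if job_type == "try-restart":
        return "nop" if inactive else "restart"
    if job_type == "reload" and reload_patch:
        # Behavior proposed by the patch under discussion.
        return "nop" if inactive else "reload"
    return job_type

print(collapse("try-restart", "inactive"))
print(collapse("reload", "inactive"))
print(collapse("reload", "inactive", reload_patch=True))
```

In this model try-restart on a unit with a merely queued start job becomes a
no-op and returns at once, while an uncollapsed reload stays a real job that
has to wait.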



> > After an operation that changes configuration,
> > "reload-or-try-restart" (or just "reload" if we know reload is in fact
> > supported by the daemon and sufficient for the configuration change in
> > question) should guarantee that further requests use the new
> > configuration. It should not unnecessarily block until the daemon is
> > actually running if it's being started; anything which relies on it
> > being up should test for that separately, independently of configuration
> > change handling. And there's no reason why implementing lighter-weight
> > reloads to replace full restarts should lead to systemd blocking
> > longer.
> 
> No. It *should* wait until any start request that is already queued is
> complete... Again, it's about the *outcome*, and about being *positive*,
> really. Nothing else. We should not make things fail, if we already
> *know* that there's already something going on that makes the thing we
> are trying to do succeed.

From the configuration change point of view, the outcome *is* positive
as soon as it's guaranteed that no further requests will be processed
with the old configuration. So systemd should return *success* at that
point. This does not "make things fail"; rather, it prevents failure
from deadlocks caused by unnecessary blocking.


> I think most of the confusion here comes from the fact that sysv service
> restarts d

Re: [systemd-devel] [PATCH] core: collapse JOB_RELOAD on an inactive unit into JOB_NOP

2014-08-14 Thread Uoti Urpala
On Fri, 2014-08-15 at 01:59 +0200, Lennart Poettering wrote:
> On Wed, 16.07.14 04:15, Jon Severinsson (j...@severinsson.net) wrote:
> > Before this commit systemctl reload on an inactive unit with a queued
> > start job would block until the unit had started if the unit supported
> > reload, but return failure immediately if the unit didn't.
> 
> This sounds like correct behaviour to me.
> 
> 1) I think the rule should be to return only after the unit is in the
> desired state, and the desired operation or something equivalent has
> been executed. This is useful, so that callers can rely on the fact that

What is your "desired state" for reload then? To me the obvious natural
meaning is that after reload has completed, it is guaranteed the service
will not process any requests with outdated configuration from before
reload was issued. Do you want to have reload imply that AND "service is
currently running"? If so, I think such a combination is better done
with "systemctl reload && systemctl start"; getting the correct
semantics for the reload part alone is harder if systemctl only
implements the combined operation.


> > Additionally systemctl reload-or-try-restart (and systemctl force-reload)
> > would block until the unit had started if the unit supported reload, but
> > return *success* immediately if the unit didn't.
> 
> This actually also sounds like correct behaviour to me.
> 
> "try-restart" means that we try to restart a service, but only if it is
> already running. If we are not running this is not considered an
> error. If you then add the "reload-or-" to the mix, this means that we
> will try to reload it if the unit supports it, instead of restarting it.
> 
> So, if a start is already queued, we'll wait for it to finish, so that
> the configuration is fully in effect afterwards. But if the thing is not
> running at all, then we will not consider that a problem at all and
> return success.

I don't think this makes sense. Reload is a "softer" alternative; if I
say "reload-or-try-restart", that should be considered as asking less
from systemd - full restart is not necessary, just reload is enough if
that option is available. Asking for less should not be a reason to
block longer!


> This is really what you want actually, because this allows admins (and
> scripts) to change some daemon configuration, and then finish off by
> issuing "reload-or-try-restart" on the daemon, so that the changes take
> effect, and then immediately talk to the daemon, knowing that if it was
> running, then the new configuration is definitely in effect. If it
> wasn't, then of course we can't talk to the service, but that's OK of
> course.

Exactly, this is a common usecase. But note that this case supports the
changed semantics of the patch, and does not support your view!

After an operation that changes configuration,
"reload-or-try-restart" (or just "reload" if we know reload is in fact
supported by the daemon and sufficient for the configuration change in
question) should guarantee that further requests use the new
configuration. It should not unnecessarily block until the daemon is
actually running if it's being started; anything which relies on it
being up should test for that separately, independently of configuration
change handling. And there's no reason why implementing lighter-weight
reloads to replace full restarts should lead to systemd blocking longer.


> > Finally reload on an inactive unit without a queued start job would
> > return failure, but try-restart on the same job would return success.
> 
> Well, the "try-" prefix is supposed to be the thing "apply only if
> already running, but don't consider it an error if it isn't". We could of
> course add "try-reload" to the mix, so that you have a "reload" version
> that will not fail if the daemon is not running, but I think that's not
> particularly interested

I think such "try-reload" semantics are the actually useful case, and
they are what plain "reload" should do, as the non-try semantics do not
seem that useful.


> > This behaviour is unintuitive and inconsistent, so make reload of inactive
> > units behave just like try-restart.
> 
> Well, it might appear surprising in some ways, but there's a lot of
> sense in it, and I think what matters is really to look at the effect of
> calls, and make guarantees about that.

try-restart: guarantee that no old daemon instance is running at all
reload: guarantee that the daemon is not running with old configuration,
at least to the degree its ExecReload functionality supports


> > Also fixes deadlocks on boot if a unit calls systemctl reload,
> > reload-or-try-restart or force-reload on a unit ordered later
> > in the boot sequence during startup.
> 
> Well, for cases like that, please consider using --no-block, which will
> just queue the job, but not wait for it. Or use
> --job-mode=ignore-dependencies or so, even though that's a horrid
> invention...

That's not safe - with --no-block the daemon may still

Re: [systemd-devel] [PATCH 08/10] units: make it possible to disable tmp.mount using systemctl

2014-07-16 Thread Uoti Urpala
On Wed, 2014-07-16 at 20:22 +0200, Tollef Fog Heen wrote:
> ]] Lennart Poettering 
> 
> > (Also I see little point in /tmp not being a tmpfs anyway. If you want a
> > lot of space there, then use swap -- of which you can have up to 2G even
> > on 32bit systems. tmpfs on swap has the great benefit that it
> > relieves the kernel from always having to ultimately flush things to disk)
> 
> Swap doesn't scale well, though.  To the point where if the amount of
> swapped-out data is > 2x physical memory, kswapd starts gobbling CPU.
> 
> Yes, that's a bug that should be fixed, but it's been that way for years
> in Linux.

At least when I tested things a few years ago, tmpfs+swap seemed to have
a more significant performance problem than CPU use. Apparently the
kernel does not remember that the data is still on disk after it has
been read back to RAM; where a normal fs would simply drop disk-backed
data from RAM, tmpfs seems to do a new write each time. When the working
set is large, this means every read from tmpfs requires an equally big
write later.

I tested something like writing a file 2x RAM size to tmpfs and reading
it back several times. On a normal filesystem it's written to disk once
and then read 10 times. With tmpfs the reads generate both read and
write IO every time, and it's a lot slower.
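
A test along those lines could be sketched as follows. This is not the script
I used back then; the function name and parameters are made up, and the
interesting part (read versus write IO per pass) has to be watched separately
with a tool such as iostat.

```python
import os
import time

def write_then_reread(path, size_bytes, passes=10, chunk=1 << 20):
    """Write size_bytes to path, then read the file back `passes` times.

    Returns the wall-clock seconds each read pass took.
    """
    block = os.urandom(chunk)
    written = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk, size_bytes - written)
            f.write(block[:n])
            written += n
    timings = []
    for _ in range(passes):
        begin = time.monotonic()
        with open(path, "rb") as f:
            while f.read(chunk):
                pass
        timings.append(time.monotonic() - begin)
    return timings
```

To reproduce the comparison, run it once with a path on tmpfs and once with a
path on a disk filesystem, with size_bytes around twice the machine's RAM.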




Re: [systemd-devel] [PATCH 2/2] compress: add benchmark-style test

2014-07-07 Thread Uoti Urpala
On Sat, 2014-07-05 at 20:56 +0200, Zbigniew Jędrzejewski-Szmek wrote:
> This is useful to test the behaviour of the compressor for various buffer
> sizes.
> 
> Time is limited to a minute per compression, since otherwise, when LZ4
> takes more than a second which is necessary to reduce the noise, XZ
> takes more than 10 minutes.
> 
> % build/test-compress-benchmark (without time limit)
> XZ: compressed & decompressed 2535300963 bytes in 794.57s (3.04MiB/s), mean 
> compresion 99.95%, skipped 3570 bytes
> LZ4: compressed & decompressed 2535303543 bytes in 1.56s (1550.07MiB/s), mean 
> compresion 99.60%, skipped 990 bytes

Like your earlier comparison, this compares the wrong thing. If
compression speed matters more than best compression ratio, you
shouldn't use the default settings for xz. If you want to compare with
LZ4, this benchmark should at least compare the equivalent of "xz -0".
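
A quick way to see how much the preset matters, as a sketch using Python's
lzma module rather than the benchmark in the patch (the sample data is
invented log-like text, so the numbers only illustrate the trend):

```python
import lzma
import time

def time_xz(data, preset):
    """Compress data with the given xz preset; return (size, seconds)."""
    begin = time.perf_counter()
    packed = lzma.compress(data, preset=preset)
    elapsed = time.perf_counter() - begin
    return len(packed), elapsed

# Invented, repetitive log-like input; real journal data will behave differently.
sample = b"Jul 07 12:00:00 host daemon[123]: message number %d\n"
data = b"".join(sample % i for i in range(20000))

for preset in (0, 6):
    size, seconds = time_xz(data, preset)
    print(f"xz -{preset}: {size} bytes in {seconds:.3f}s")
```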




Re: [systemd-devel] systemd-fsck-root semantics

2014-07-03 Thread Uoti Urpala
On Thu, 2014-07-03 at 12:00 -0600, Chris Murphy wrote:
> What about a new fs_passno value of -1 that means "use default for this file 
> system type", and systemd spawns fsck based on the recommendation of that 
> file system's devs?

How should the file system devs communicate their current
recommendation? If your answer is "they tell systemd developers, who set
that in systemd source, and then systemd is shipped to users", that
seems less than optimal. The "don't ship fsck.foo" or "symlink fsck.foo
to /bin/true" methods seem much better from the point of view that the
default is communicated with the filesystem tools.




Re: [systemd-devel] Multiple template parameters for one service

2014-06-30 Thread Uoti Urpala
On Mon, 2014-06-30 at 12:38 +0200, Lennart Poettering wrote:
> On Sat, 28.06.14 18:15, Moviuro (movi...@gmail.com) wrote:
> > An idea would be to use units with many '@' or have systemd interpret the 
> > string between '@' and '.service' as '@'-separated values (e.g. 
> > unison@local_user@profile.service).

> Hummm... So far the instancing was strictly one-dimensional from
> systemd's PoV. And I think I would prefer it like that, since it makes
> so many things easier. I mean, as you notice, one can always parse this
> from shell or so, if you want, so we can actually get away with not
> supporting anything more complex with systemd.

Shouldn't just another '%x' format specifier or two for unit files be
enough to get most of the benefit, without changing any of the
underlying architecture? As in something like "%?5?" meaning "interpret
the instance name as a whatever-delimited list, and place the 5th
element of the list here".
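
What such a specifier would compute is simple. A sketch, where the function
name, the 1-based index and the '@' delimiter are all hypothetical (no such
specifier exists in systemd as far as this thread goes):

```python
def instance_field(instance, index, delimiter="@"):
    """Return the index-th (1-based) element of a delimited instance name."""
    fields = instance.split(delimiter)
    if not 1 <= index <= len(fields):
        raise IndexError(f"instance {instance!r} has no field {index}")
    return fields[index - 1]

# For unison@local_user@profile.service the instance name is
# "local_user@profile".
print(instance_field("local_user@profile", 1))
print(instance_field("local_user@profile", 2))
```

The instance name itself stays a single opaque string as far as the rest of
systemd is concerned; only the expansion in the unit file looks inside it.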




Re: [systemd-devel] [systemd-commits] 14 commits - configure.ac Makefile.am Makefile-man.am man/coredump.conf.xml src/core src/journal src/shared

2014-06-27 Thread Uoti Urpala
On Fri, 2014-06-27 at 16:17 +0200, Zbigniew Jędrzejewski-Szmek wrote:
> Hm, I did some testing, and I'm not convinced that XZ is the right
> compressor for the job.
> 
> First I generated a 1GB coredump of Python with random patterns. It
> takes 20 minutes (!)  to compress with XZ 9, and 11.5 min with XZ 6,
> ~1 min with gzip 6, the same with gzip 9. The gain from XZ compression
> is an increase in compression: gzip saves 7%, XZ saves 12%.
> 
> Second I generated a second 1GB coredump, highly compressible.
> XZ 9 → 99.8%, 120 s; XZ 6 → 99.8%, 120 s;
> gzip 6 → 99.6%, 11 s; gzip 9 → 99.6%, 13 s;
> 
> So the tradeoffs seem all wrong.

I think this is a bad comparison.

You tested mostly random content (which no compressor will compress) and
trivially compressible content (which anything will highly compress);
neither allows meaningfully comparing compression ratios. If you want to
quote compression ratios, you need more realistic content.

You didn't test XZ levels lower than 6, even though you tested gzip
levels lower than 9. XZ level 6 is much higher than gzip level 9. If
you want higher speed, you should test with lower XZ levels.

For time/compression ratio tradeoff with large files, lrzip could be the
best. That probably doesn't make it the best fit for systemd use though,
due to other issues such as library integration and memory use. Or at
least not the best fit to use as default; it could be a marked
improvement over other alternatives in particular cases where an admin
could configure it as an external compressor.
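
The problem with the two extreme inputs is easy to demonstrate. A sketch,
using Python's zlib (DEFLATE, the algorithm gzip uses, though not the gzip
container) and lzma as stand-ins for the command-line tools:

```python
import lzma
import os
import zlib

def saved_percent(data, compress):
    """Percentage of the input size saved by the given compress function."""
    return 100.0 * (1 - len(compress(data)) / len(data))

size = 1 << 20
inputs = {"random": os.urandom(size), "zeros": bytes(size)}

for name, data in inputs.items():
    gz = saved_percent(data, lambda d: zlib.compress(d, 6))
    xz = saved_percent(data, lambda d: lzma.compress(d, preset=6))
    print(f"{name}: deflate saves {gz:.2f}%, xz saves {xz:.2f}%")
```

Both compressors save essentially nothing on the random input and almost
everything on the zeros, so neither input separates them on ratio.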




[systemd-devel] [PATCH] core/transaction: fix cycle break attempts outside transaction

2014-06-23 Thread Uoti Urpala
The attached patch fixes some incorrect-looking code in transaction.c.
It could fix cases where Debian users with bad package configurations
had systemd go into an infinite loop printing messages about breaking an
ordering cycle, though I have not reproduced that problem myself.

From 6d575f18437dd5bfd02c4736dbd3e6a8a1286ab2 Mon Sep 17 00:00:00 2001
From: Uoti Urpala 
Date: Mon, 23 Jun 2014 08:14:22 +0300
Subject: [PATCH] core/transaction: fix cycle break attempts outside
 transaction

transaction_verify_order_one() considers jobs/units outside current
transaction when checking whether ordering dependencies cause cycles.
It would also incorrectly try to break cycles at these jobs; this
can not work, as the break action is to remove the job from the
transaction, which is a no-op if the job isn't part of the transaction
to begin with. The unit_matters_to_anchor() test also looks like it
would not work correctly for non-transaction jobs. Add a check to
verify that the unit is part of the transaction before considering a
job a candidate for deletion.
---
 src/core/transaction.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/core/transaction.c b/src/core/transaction.c
index d23a45c..805d40a 100644
--- a/src/core/transaction.c
+++ b/src/core/transaction.c
@@ -381,7 +381,7 @@ static int transaction_verify_order_one(Transaction *tr, Job *j, Job *from, unsi
   "Found dependency on %s/%s",
   k->unit->id, job_type_to_string(k->type));
 
-if (!delete &&
+if (!delete && hashmap_get(tr->jobs, k->unit) &&
 !unit_matters_to_anchor(k->unit, k)) {
 /* Ok, we can drop this one, so let's
  * do so. */
-- 
1.7.6.561.g3822
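
For readers who do not want to dig through transaction.c, the changed
condition can be modeled like this. It is a toy sketch only: may_drop and its
arguments are illustrative names, with a set standing in for the tr->jobs
hashmap and a callback standing in for unit_matters_to_anchor().

```python
def may_drop(unit, transaction_units, matters_to_anchor, already_chosen=False):
    """Toy version of the patched test for picking a job to delete.

    A job is a candidate only if nothing was chosen yet, its unit is part
    of the transaction, and it does not matter to the anchor job.
    """
    return (not already_chosen
            and unit in transaction_units
            and not matters_to_anchor(unit))

transaction_units = {"a.service"}
never_matters = lambda unit: False

print(may_drop("a.service", transaction_units, never_matters))
# A job outside the transaction is no longer picked: removing it from the
# transaction would be a no-op and the cycle would remain.
print(may_drop("b.service", transaction_units, never_matters))
```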



Re: [systemd-devel] Start-up resource and prioritization control

2014-05-22 Thread Uoti Urpala
On Thu, 2014-05-22 at 08:40 +0200, Umut Tezduyar Lindskog wrote:
> On Wed, May 21, 2014 at 10:44 PM, Uoti Urpala wrote:
> > On Tue, 2014-05-20 at 15:16 +0200, Umut Tezduyar Lindskog wrote:
> >> This is exactly what the cpu.shares cgroup property does and that is
> >> what the patch posted on ML is trying to utilize.
> >
> > I don't think they are the same. If I understood correctly, the patch
> > was about setting priorities. Latency is a different thing.
> 
> Tom has explained what he meant with "latency" as large time slice
> (tom: i.e., give each of them a relatively large timeslice). Giving
> large time slice is exactly what the mentioned patch is doing.
> Indirectly, changing the priority of some services.

No, they are not the same. One is about picking the running task among a
smaller set of services. The other is about changing the running task
less often.


> > Your problem was context switch overhead. That can be fixed by making
> > the kernel switch tasks less often - even if there are 100 runnable
> > tasks, if the kernel keeps running each task for a second before
> > switching to the next one, context switch overhead will not be large.
> 
> Investigating various tick interrupt frequency is another option but I
> rather not go that route before exhausting other possibilities since
> it will affect the entire system, especially RT jobs.

Tick interrupt is a different thing - context switches are not timed
with anything like "switch at each tick" IIRC. Changing the default time
slice length need not affect realtime tasks; as soon as a high-priority
RT task becomes runnable, the scheduler can still immediately switch to
that, no matter how long the default time slices are. Also, if changing
scheduler parameters systemwide does cause some other problem, then you
could change them back after startup has finished (instead of changing
task priorities).




Re: [systemd-devel] Start-up resource and prioritization control

2014-05-21 Thread Uoti Urpala
On Tue, 2014-05-20 at 15:16 +0200, Umut Tezduyar Lindskog wrote:
> On Tue, May 20, 2014 at 1:46 PM, Tom Gundersen wrote:
> > Wouldn't this be solved by telling the kernel to schedule the starting
> > services with high latency (or whatever the terminology is), i.e.,
> > give each of them a relatively large timeslice. That would decrease
> > the flushing, but at the same time avoid any issues with deadlocks
> > etc. It should also give us the flexibility to give some services low
> > latency if that is required for them etc (think udev/systemd/dbus and
> > other things which would otherwise block boot).
> 
> This is exactly what the cpu.shares cgroup property does and that is
> what the patch posted on ML is trying to utilize.

I don't think they are the same. If I understood correctly, the patch
was about setting priorities. Latency is a different thing.

Your problem was context switch overhead. That can be fixed by making
the kernel switch tasks less often - even if there are 100 runnable
tasks, if the kernel keeps running each task for a second before
switching to the next one, context switch overhead will not be large.
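
The arithmetic behind that, as a sketch: the 50 microsecond cost per switch
(including cache refill) is an invented figure for illustration, not a
measurement from any system.

```python
def switch_overhead(timeslice_s, switch_cost_s):
    """Fraction of CPU time lost to switching when every slice is used fully."""
    return switch_cost_s / (timeslice_s + switch_cost_s)

# Hypothetical cost of one context switch, cache refill included.
cost = 50e-6
for timeslice in (0.001, 0.1, 1.0):
    fraction = switch_overhead(timeslice, cost)
    print(f"timeslice {timeslice}s: {fraction:.4%} overhead")
```

The number of runnable tasks does not appear in the formula; only the
length of each slice relative to the switch cost does.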




Re: [systemd-devel] [PATCH] service: don't create extra cgroup for control process when reloading SysV service

2014-03-12 Thread Uoti Urpala
On Wed, 2014-03-12 at 16:51 +0100, Lennart Poettering wrote:
> On Mon, 10.03.14 15:25, Lukas Nykryn (lnyk...@redhat.com) wrote:
> 
> > Unfortunately common practice in initscripts is to have reload as an
> > alias for restart (https://fedoraproject.org/wiki/Packaging:SysVInitScript).
> > In that case the newly started process will be killed immediately after
> > the reload process ends and its cgroup is destroyed.


> I am not sure I grok why this all would be a problem at all, given that
> on Fedora/RHEL we redirect those verbs to systemctl anyway, and
> systemctl handles reload/restart on its own anyway... What am I missing?

But systemctl supports using the reload functionality in init scripts,
so that doesn't really make a difference. As I understood the problem
description, this is what happens: someone runs "systemctl reload
foo.service" for a broken sysv script, systemd sees that the script
seems to support a "reload" argument and runs "/etc/init.d/foo reload"
in a temporary cgroup, but the broken script stops the running service
and starts a new one in the temporary cgroup.



Re: [systemd-devel] [PATCH] logs-show: fix corrupt output with empty messages

2014-02-27 Thread Uoti Urpala
On Thu, 2014-02-27 at 06:47 +0100, Zbigniew Jędrzejewski-Szmek wrote:
> Applied, though I changed the "fix" to simply print a newline.
> Just seems nicer this way.

OK, though I would have added a "return false;" if doing it that way -
now it somewhat non-obviously depends on the code below being a no-op in
this case.



Re: [systemd-devel] [PATCH] logs-show: fix corrupt output with empty messages

2014-02-26 Thread Uoti Urpala
On Thu, 2014-02-20 at 03:00 +0200, Uoti Urpala wrote:
> If a message had zero length, journalctl would print no newline, and

Ping. There's been no reply to the above message, but it's a simple
change and I don't think the bug is one people would want to leave
unfixed.



Re: [systemd-devel] [HEADS-UP] It's release time!

2014-02-20 Thread Uoti Urpala
On Thu, 2014-02-20 at 02:03 +0100, Lennart Poettering wrote:
> On Thu, 20.02.14 01:21, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > Even if there can be reasonable style disagreements about exactly where
> > to use mixed declarations, at least some uses of them are certainly
> > beneficial. It's only a matter of getting used to reading them if you've
> > only read old-style code before. I'm sure that if C had had mixed
> > declarations from the beginning, nobody would come up with a coding
> > style which declared that particular feature to be harmful.
> > 
> > Given systemd's approach to features, I think it's pretty ironic if its
> > coding style has a "you can't expect me to get used to new features"
> > attitude to something that's been used for more than a decade.
> 
> Oh, it's really not like that. We make use of a lot of newer language
> features all the time. We have a lot of gccisms in our code, such
> as the gcc cleanup attribute. And there's already C11 bits in the code,
> too.

I know that some other new features are used. However, I don't believe
that the underlying reason behind opposing mixed declarations would be
anything other than being used to their absence and opposing change.

>  However, there are certain language features that we consider
> obvious improvements and there are others where we are a lot more
> conservative.
> 
> It's a matter of taste I figure, it's like tabs vs. spaces. We don't
> allow tabs either in our sources... And neither do we allow declaration
> after statements...

For indentation style, you have to pick _something_ anyway. But you
don't have to randomly forbid some normal language features, and the
only reason for people to have such a "taste" is being used to old-style
sources. There is no reason why people would pick out mixed declarations
in particular as something to oppose if it was not a newer feature.

If C had had mixed declarations from the beginning, but not the "->"
operator, we might have people who are fine with mixed declarations but
insist that people write (*p).x instead of p->x. Nobody has such a taste
now, because nobody has grown used to sources written only in that
style.

> We are apparently not alone on this btw, after all gcc *does* have this
> warning flag support even in C99 and C11 mode...

Yes, there are people who still want to avoid that. I think they're
quite similar to the people who insist that systemd can only be harmful,
since sysvinit has worked fine for them for 20+ years. That's the reason for my
comment about irony above.



[systemd-devel] [PATCH] logs-show: fix corrupt output with empty messages

2014-02-19 Thread Uoti Urpala
Attached.

I also noticed a minor C correctness problem in print_multiline(): if
the last line of the message has no newline, then end = message+message_len.
The next loop iteration calculates "pos = end + 1". This means that pos
points 2 past the last byte, which is not guaranteed to be a valid
address calculation for general C objects (could theoretically wrap
around to start of address space etc). Probably won't be called with
such objects in practice.
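
As an illustration of the point, here is a minimal sketch of a line-splitting loop that never forms a pointer beyond one past the end of the buffer. It is not the print_multiline() code; count_lines() is a hypothetical helper written only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the lines in a buffer of len bytes without ever forming a pointer
 * beyond buf + len (one past the end is the last address C guarantees). */
static size_t count_lines(const char *buf, size_t len) {
        assert(buf);

        const char *limit = buf + len;
        size_t lines = 0;

        for (const char *pos = buf; pos < limit; lines++) {
                const char *end = memchr(pos, '\n', (size_t) (limit - pos));

                /* Step over the newline only if there is one; otherwise
                 * stop exactly at limit instead of at limit + 1. */
                pos = end ? end + 1 : limit;
        }

        return lines;
}
```

The difference from the "pos = end + 1" form is that the unterminated last line sets pos to limit directly, so the largest address ever computed is buf + len.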

From 4775218f59474165395f7868dfbf9e6b831f5fee Mon Sep 17 00:00:00 2001
From: Uoti Urpala 
Date: Thu, 20 Feb 2014 02:31:04 +0200
Subject: [PATCH] logs-show: fix corrupt output with empty messages

If a message had zero length, journalctl would print no newline, and
two output lines would be concatenated. Fix. The problem was
introduced in commit 31f7bf199452 ("logs-show: print multiline
messages"). Affected short and verbose output modes.

Before fix:

Feb 09 21:16:17 glyph dhclient[1323]: Feb 09 21:16:17 glyph NetworkManager[788]:  (enp4s2): DHCPv4 state changed nbi -> preinit

after:

Feb 09 21:16:17 glyph dhclient[1323]:
Feb 09 21:16:17 glyph NetworkManager[788]:  (enp4s2): DHCPv4 state changed nbi -> preinit
---
 src/shared/logs-show.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/src/shared/logs-show.c b/src/shared/logs-show.c
index 0f27c4e..fb2aeda 100644
--- a/src/shared/logs-show.c
+++ b/src/shared/logs-show.c
@@ -124,6 +124,13 @@ static bool print_multiline(FILE *f, unsigned prefix, unsigned n_columns, Output
                 }
         }
 
+        if (message_len == 0) {
+                /* Without this, the loop below would print no '\n' for empty
+                 * message. */
+                message = "\n";
+                message_len = 1;
+        }
+
         for (pos = message;
              pos < message + message_len;
              pos = end + 1, line++) {
-- 
1.7.6.561.g3822


Re: [systemd-devel] [HEADS-UP] It's release time!

2014-02-19 Thread Uoti Urpala
On Wed, 2014-02-19 at 17:53 +0100, Lennart Poettering wrote:
> Zbigniew suggested we should drop -Wdeclaration-after-statement. I am not
> convinced that that would be a good idea since generally declarations
> after statements are an abomination, and we should avoid them, and it is
> nice if gcc warns about that.

Even if there can be reasonable style disagreements about exactly where
to use mixed declarations, at least some uses of them are certainly
beneficial. It's only a matter of getting used to reading them if you've
only read old-style code before. I'm sure that if C had had mixed
declarations from the beginning, nobody would come up with a coding
style which declared that particular feature to be harmful.

Given systemd's approach to features, I think it's pretty ironic if its
coding style has a "you can't expect me to get used to new features"
attitude to something that's been used for more than a decade.
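
To make the comparison concrete, here is a small sketch of the same function in both styles. The function names are made up for this example and nothing here is taken from the systemd sources.

```c
#include <assert.h>
#include <stddef.h>

/* Old style: every variable is declared at the top of the block, before
 * it can be given a meaningful value. */
static int sum_positive_old(const int *v, size_t n) {
        int sum;
        size_t i;

        assert(v || n == 0);
        sum = 0;
        for (i = 0; i < n; i++)
                if (v[i] > 0)
                        sum += v[i];
        return sum;
}

/* Mixed declarations (C99): each variable appears where it is first
 * needed and is initialized at once. */
static int sum_positive_mixed(const int *v, size_t n) {
        assert(v || n == 0);

        int sum = 0;
        for (size_t i = 0; i < n; i++)
                if (v[i] > 0)
                        sum += v[i];
        return sum;
}
```

With -Wdeclaration-after-statement, gcc warns about "int sum = 0;" in the second function because it follows the assert() statement. Both functions compute the same result.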

BTW I looked at the CODING_STYLE file and there's a factual error:
'Processors speak "double" natively anyway' is not true for SSE math,
which is normally used for all math operations on AMD64. SSE has
separate operations for floats and doubles, and doubles can be slower.



Re: [systemd-devel] [RFC PATCH 1/2] Replace mkostemp+unlink with open(O_TMPFILE)

2014-01-27 Thread Uoti Urpala
On Mon, 2014-01-27 at 19:53 +0100, Kay Sievers wrote:
> >> >> Can we expect open(O_TMPFILE) to fail on kernels which do not support 
> >> >> it?

> Yeah, but what happens for a "/tmp/does-not-exist" request? :)

#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)

the O_DIRECTORY part should make it fail.
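
A minimal sketch of how a caller could rely on that, assuming Linux and glibc: try O_TMPFILE first and fall back to the mkstemp-and-unlink approach if the open fails. open_unnamed() and the "tmp-XXXXXX" template are hypothetical names for this example, not code from the patch under discussion.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_TMPFILE; must precede the includes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return an fd for an unnamed file in dir, or -1. */
static int open_unnamed(const char *dir) {
        assert(dir);

#ifdef O_TMPFILE
        /* Because O_TMPFILE includes O_DIRECTORY, a kernel that does not
         * know __O_TMPFILE should reject this open rather than succeed. */
        int fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0600);
        if (fd >= 0)
                return fd;
#endif

        /* Fallback: create a named file and unlink it right away. */
        char path[PATH_MAX];
        int n = snprintf(path, sizeof path, "%s/tmp-XXXXXX", dir);
        if (n < 0 || (size_t) n >= sizeof path)
                return -1;

        int tmp = mkstemp(path);
        if (tmp >= 0) {
                unlink(path);
                fcntl(tmp, F_SETFD, FD_CLOEXEC);
        }
        return tmp;
}
```

For a path such as "/tmp/does-not-exist", both attempts fail and the helper returns -1; no file is created under that name, since O_CREAT is never passed.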

