Hi again,
thank you both for your input! See comments inline:
On 6/17/19 10:20 PM, Jouke Witteveen via arch-projects wrote:
> On Mon, Jun 17, 2019 at 9:45 PM Erich Eckner via arch-projects
> wrote:
>>> In case you are not familiar with cloud-init, the idea is that you can
>>> build a single OS image that runs cloud-init on boot, and cloud-init
>>> will take care of such things as network configuration, so that the same
>>> image will work regardless of the network setup you choose for the cloud
>>> instance.
>>
>> Does cloud-init run before or after systemd? In other words: is it a
>> systemd unit of some kind or is it rather an init daemon itself which
>> chain-loads systemd?
Cloud-init comes with multiple systemd units and as such is run by
systemd multiple times at different stages during the boot process. The
cloud-init wiki page has a rough overview:
https://wiki.archlinux.org/index.php/Cloud-init#Systemd_integration
>>> The current cloud-init implementation for Arch uses netctl [3]. The
>>> implementation is correct in such a way that it does indeed render the
>>> right netctl profile(s) and enables them. However there is a problem:
>>> they are not being started. AFAICT this is because cloud-init does this
>>> while the systemd boot is already in process, and changing the
>>> dependency graph (by adding new units) does not have any effect until
>>> the next run (everything works right on second boot). Note that I even
>>> tried having cloud-init run `systemctl daemon-reload` after enabling the
>>> units, but it didn't help either.
>>
>> Did you try having cloud-init issue "systemctl start $unitname.service"
>> in addition to "systemctl enable $unitname.service"? This seems to me to
>> be the right way.
It might be worth taking another look at that, but let me quickly lay
out why I didn't try this yet: when cloud-init runs for the first time,
it goes through a bunch of plugins called "data sources", which will
probe different aspects of the environment to determine the cloud
provider it is running in, use that knowledge to retrieve
vendor-specific configuration details, and use that to write e.g.
network config, hostname, etc. The tricky part is that for example the
EC2 data source uses a magic IP to retrieve this config (see
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html#instancedata-data-retrieval),
and other data sources might do similar things. Hence, I was worried
that prematurely re-configuring the network might interfere with such
actions (the unit running this has "Before=network-pre.target").
However, if cloud-init first fetches all data and then configures things
it might not be a problem. I'll take a closer look at what is happening
there and maybe try to get a statement from the cloud-init folks.
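
To make the ordering I am hoping for concrete, here is a minimal, untested
sketch. All three callables and the function name are hypothetical
placeholders, not cloud-init's actual API; the only point is that nothing
touches the network before the data source has been queried.

```python
from typing import Callable, Dict, List


def configure_network(
    fetch_metadata: Callable[[], Dict[str, str]],
    render_profiles: Callable[[Dict[str, str]], List[str]],
    start_profile: Callable[[str], None],
) -> List[str]:
    """Fetch everything first, then write profiles, then start them."""
    # 1. Query the data source while the boot-time network state is untouched
    #    (e.g. the EC2 metadata address is still reachable).
    metadata = fetch_metadata()
    # 2. Write the netctl profiles; this alone does not reconfigure anything.
    profiles = render_profiles(metadata)
    # 3. Only now start the profiles, which may change the network setup.
    for name in profiles:
        start_profile(name)
    return profiles
```

If cloud-init already behaves like this internally, starting the profiles
right after enabling them should be safe.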
>>> The reason I am posting this here is that this seems to be an issue due
>>> to the particular way netctl uses systemd units. Since you don't know the
>>> names or the number of profiles (units) that will be generated during
>>> image creation, you cannot enable them at that time. But doing so during
>>> first boot does not seem to work.
>>
>> I would rather say it's due to the way cloud-init uses systemd units: it
>> enables them, but that's only relevant for successive boots, so it should
>> rather enable and start them (systemd should still honor the dependencies
>> of the units and postpone the start to the point where all of the
>> dependencies are loaded, too).
>>
>>>
>>> Just for comparison, if one were to use e.g. systemd-networkd instead,
>>> you would just enable the systemd-networkd unit during image creation,
>>> cloud-init could generate the appropriate config for any number of
>>> devices, and when the unit starts it will do the right thing. Likewise
>>> on other distros, e.g. Debian with /etc/network/interfaces or such.
>>>
>>> Now, from my point of view, there could be several approaches to solve
>>> this:
>>>
>>> 1. systemd supports updates of the dep graph during boot
>>> 2. support such a use case in netctl
>>> 3. change cloud-init to use systemd-networkd for Arch
>>>
>>> Let me quickly elaborate:
>>>
>>> 1. is intentionally not phrased as something to be done. It might
>>> already be a thing, I just couldn't figure out how to do it. If someone
>>> knows more about this, I would love to hear about it. If this works, it
>>> would be the easiest solution. However, if it doesn't, I don't have my
>>> hopes up high for this being added to systemd anytime soon.
>>
>> This would mean, if I "systemctl enable $some.service", it will be started
>> right away, too - probably not what systemd devs want (at least it's
>> not what systemd currently does).
>
> `systemctl enable --now` starts a service in addition to enabling
> it.
Might be an option, see above.
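
For the netctl case, what I have in mind would look roughly like the
following. This is an untested sketch, not what cloud-init currently does:
the helper names are made up, and I am assuming that `netctl enable`
followed by `netctl start` is an acceptable stand-in for
`systemctl enable --now` on the generated units.

```python
import subprocess
from typing import List


def netctl_commands(profiles: List[str]) -> List[List[str]]:
    """Build the enable + start command pair for each rendered profile."""
    cmds = []
    for name in profiles:
        # Enable for subsequent boots ...
        cmds.append(["netctl", "enable", name])
        # ... and start right away, so the first boot works as well.
        cmds.append(["netctl", "start", name])
    return cmds


def run_all(cmds: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for argv in cmds:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Calling `run_all(netctl_commands(["eth0"]))` only prints the two commands;
passing `dry_run=False` would actually execute them and needs root.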
>>> 2. is the main reason I am writing this. Things that came to mind were
>>> another special unit (netctl-all?), or even just a well-defined
>>> interface to write devices into the