Re: [systemd-devel] [PATCH V1] Add L3 cache allocation settings in systemd

2018-06-29 Thread systemd github import bot
Patchset imported to github.
To create a pull request, one of the main developers has to initiate one via:


--
Generated by https://github.com/haraldh/mail2git
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] [PATCH V1] Add L3 cache allocation settings in systemd

2018-06-29 Thread Zhangyanfei (YF)
Hello,

I am sorry, but I can only send the patches by email because of
security restrictions and the limits of my work environment.


From 962eeb1869fb033d04074df0f992a6588e97164e Mon Sep 17 00:00:00 2001
From: Yanfei Zhang 
Date: Fri, 8 Jun 2018 03:00:53 -0400
Subject: [PATCH] Add L3 cache allocation settings in systemd

This patch adds L3 cache allocation settings to systemd.
L3 cache allocation control is supported by recent Intel CPUs and
is exposed through a new filesystem named resctrl. For details,
please refer to https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt.

The patch adds the following things:
1. Mount resctrl when systemd starts.
2. Add two config items for L3 cache allocation control
   - L3CacheAllocationSize=XX
 L3CacheAllocationSize indicates the maximum amount of cache
 the group can use. Note that the value is rounded up to a
 multiple of min_granularity. For example, if the L3 cache size
 is 30M and it is divided into 10 ways, each way is 3M, so
 L3CacheAllocationSize=10M gives the group 4 ways
 (4 * 3M = 12M >= 10M) of the cache.

   - L3CacheAllocationIds=XX
 L3CacheAllocationIds selects the cache id for this group.
 With two CPU sockets there are two L3 caches, and this
 setting indicates which socket's cache we want to control.
3. Create a new directory in resctrl and initialize the schemata
   when systemd detects the above two configs in a unit. The directory
   name is the same as the unit name.
4. The settings can be applied to all units, just like cgroup settings.
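
The rounding rule in item 2 can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from src/core/resctrl.c (which is not shown here): the helper names `l3_ways_for_size` and `l3_cbm_for_ways` are hypothetical, the way size is assumed to be the cache size divided by the number of ways, at least one way is assumed to be required, and the capacity bitmask is assumed to use the lowest contiguous bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of cache ways needed to cover `requested`
 * bytes, rounded up to whole ways and clamped to the range 1..num_ways. */
static unsigned l3_ways_for_size(uint64_t requested, uint64_t cache_size, unsigned num_ways) {
        uint64_t way_size, ways;

        assert(num_ways > 0);
        assert(cache_size >= num_ways);

        way_size = cache_size / num_ways;        /* e.g. 30M / 10 = 3M */

        /* Round up without risking overflow in (requested + way_size - 1). */
        ways = requested / way_size + (requested % way_size != 0);

        if (ways == 0)
                ways = 1;                        /* assumption: an empty mask is not useful */
        if (ways > num_ways)
                ways = num_ways;                 /* cannot hand out more than exists */

        return (unsigned) ways;
}

/* Hypothetical helper: contiguous bitmask with the lowest `ways` bits set,
 * e.g. 4 ways -> 0xf. */
static uint64_t l3_cbm_for_ways(unsigned ways) {
        assert(ways > 0 && ways <= 64);
        return ways == 64 ? UINT64_MAX : (UINT64_C(1) << ways) - 1;
}
```

With the numbers from the commit message, `l3_ways_for_size(10M, 30M, 10)` yields 4, and `l3_cbm_for_ways(4)` yields 0xf.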

Signed-off-by: Yanfei Zhang 
---
 src/basic/exit-status.h   |   1 +
 src/basic/parse-util.c|  22 +
 src/basic/parse-util.h|   6 +
 src/core/dbus-resctrl.c   |  70 +++
 src/core/dbus-resctrl.h   |  25 +
 src/core/dbus-slice.c |   6 +
 src/core/execute.c|  11 +
 src/core/load-fragment-gperf.gperf.m4 |  10 +
 src/core/load-fragment.c  |  43 ++
 src/core/load-fragment.h  |   1 +
 src/core/main.c   |   1 +
 src/core/meson.build  |   4 +
 src/core/mount-setup.c|   2 +
 src/core/mount.c  |   2 +
 src/core/mount.h  |   1 +
 src/core/resctrl.c| 743 ++
 src/core/resctrl.h|  90 
 src/core/scope.c  |   2 +
 src/core/scope.h  |   1 +
 src/core/service.c|   6 +-
 src/core/service.h|   1 +
 src/core/slice.c  |   3 +
 src/core/slice.h  |   1 +
 src/core/socket.c |   2 +
 src/core/socket.h |   1 +
 src/core/swap.c   |   2 +
 src/core/swap.h   |   1 +
 src/core/unit.c   |  25 +
 src/core/unit.h   |   6 +
 29 files changed, 1088 insertions(+), 1 deletion(-)
 create mode 100644 src/core/dbus-resctrl.c
 create mode 100644 src/core/dbus-resctrl.h
 create mode 100644 src/core/resctrl.c
 create mode 100644 src/core/resctrl.h

diff --git a/src/basic/exit-status.h b/src/basic/exit-status.h
index c41e8b8..fbd3fee 100644
--- a/src/basic/exit-status.h
+++ b/src/basic/exit-status.h
@@ -69,6 +69,7 @@ enum {
 EXIT_CACHE_DIRECTORY,
 EXIT_LOGS_DIRECTORY, /* 240 */
 EXIT_CONFIGURATION_DIRECTORY,
+EXIT_RESCTRL_WRITE_PID,
 };
 
 typedef enum ExitStatusLevel {
diff --git a/src/basic/parse-util.c b/src/basic/parse-util.c
index 6becf85..dd23ca3 100644
--- a/src/basic/parse-util.c
+++ b/src/basic/parse-util.c
@@ -453,6 +453,28 @@ int safe_atollu(const char *s, long long unsigned *ret_llu) {
 return 0;
 }
 
+int safe_atollx(const char *s, long long unsigned *ret_llx) {
+char *x = NULL;
+unsigned long long l;
+
+assert(s);
+assert(ret_llx);
+
+s += strspn(s, WHITESPACE);
+
+errno = 0;
+l = strtoull(s, &x, 16);
+if (errno > 0)
+return -errno;
+if (!x || x == s || *x != 0)
+return -EINVAL;
+if (*s == '-')
+return -ERANGE;
+
+*ret_llx = l;
+return 0;
+}
+
 int safe_atolli(const char *s, long long int *ret_lli) {
 char *x = NULL;
 long long l;
diff --git a/src/basic/parse-util.h b/src/basic/parse-util.h
index f3267f4..8ebed40 100644
--- a/src/basic/parse-util.h
+++ b/src/basic/parse-util.h
@@ -34,6 +34,7 @@ static inline int safe_atou(const char *s, unsigned *ret_u) {
 
 int safe_atoi(const char *s, int *ret_i);
 int safe_atollu(const char *s, unsigned long long *ret_u);
+int safe_atollx(const char *s, unsigned long long *ret_x);
 int safe_atolli(const char *s, long long int *ret_i);
 
 int safe_atou8(const char *s, uint8_t *ret);
@@ -65,6 +66,11 @@ static 
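
The safe_atollx() hunk above follows the usual strtoull() checking pattern for parsing a hexadecimal value. A standalone sketch of that pattern is given below so it can be tried outside the systemd tree. The name `parse_hex_ull` is hypothetical, the whitespace set is spelled out as an assumption in place of systemd's WHITESPACE macro, and the minus-sign check is done before the conversion rather than after it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone variant of the patch's safe_atollx(): parse a
 * hexadecimal string into an unsigned long long, returning 0 on success or
 * a negative errno-style value on failure. */
static int parse_hex_ull(const char *s, unsigned long long *ret) {
        char *end = NULL;
        unsigned long long v;

        assert(s);
        assert(ret);

        /* Skip leading whitespace (assumed set: space, tab, newline, CR). */
        s += strspn(s, " \t\n\r");

        /* strtoull() accepts a leading '-' and negates the result, so
         * reject negative input explicitly. */
        if (*s == '-')
                return -ERANGE;

        errno = 0;
        v = strtoull(s, &end, 16);
        if (errno > 0)
                return -errno;           /* e.g. ERANGE on overflow */
        if (!end || end == s || *end != '\0')
                return -EINVAL;          /* no digits, or trailing garbage */

        *ret = v;
        return 0;
}
```

Because the base is 16, strtoull() also accepts an optional "0x" prefix, so both "1f" and "0x1f" parse to the same value.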

Re: [systemd-devel] Waiting on a bus name to appear in systemd service

2018-06-29 Thread Ryan Gonzalez

systemd units can depend on other units, not on bus names. In your example, you'd want:

Wants=polkit.service

However, in most cases, you don't actually want to do this; if the service 
(in this case, polkit) tells systemd what bus name it is going to ask for, 
systemd will automatically wait when your service asks for it.


For instance, the moment your service tries to connect to 
org.freedesktop.PolicyKit1, systemd will wait for polkit to start before 
letting your service continue.
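
For the rare case where an explicit dependency is still wanted, a unit fragment might look like the sketch below. It assumes polkit's unit is named polkit.service; the After= line is an addition that is not in the message above, included because Wants= on its own does not order startup.

```ini
[Unit]
# Pull in polkit when this unit starts (assumed unit name: polkit.service).
Wants=polkit.service
# Wants= does not imply ordering; After= delays this unit until polkit has started.
After=polkit.service
```

If polkit.service is declared with Type=dbus and BusName=org.freedesktop.PolicyKit1, as is typical, systemd considers it started only once that name has appeared on the bus, so After= effectively waits for the bus name.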



On June 29, 2018 4:44:27 PM Federico Di Pierro  wrote:


Hi everyone!

I was wondering whether there was a way for a systemd service to wait for a
bus name to appear before starting a service.
Something like:

Requires=org.freedesktop.PolicyKit1

I could not find much googling around.
Is this possible?
Thanks everyone!
Federico





[systemd-devel] Waiting on a bus name to appear in systemd service

2018-06-29 Thread Federico Di Pierro
Hi everyone!

I was wondering whether there was a way for a systemd service to wait for a
bus name to appear before starting a service.
Something like:

Requires=org.freedesktop.PolicyKit1

I could not find much googling around.
Is this possible?
Thanks everyone!
Federico


Re: [systemd-devel] systemd-nspawn: starting multiple shells

2018-06-29 Thread Nikolaus Rath
On Jun 25 2018, Lennart Poettering  wrote:
> On Sa, 23.06.18 21:57, Nikolaus Rath (nikol...@rath.org) wrote:
>
>> On Jun 23 2018, Nikolaus Rath  wrote:
>> > On Jun 23 2018, aleivag  wrote:
>> >> short answer, yes, `machinectl login` is only supported for systemd-init,
>> >> and `machinectl shell` and `systemd-run` will try to talk to the container via
>> >> dbus, so I don't think you are forced to have systemd running inside the
>> >> container (I may be wrong), but you do need to have dbus (and it's easy to
>> >> just have systemd). If you don't need it, you can always use nsenter to
>> >> access a namespace on your machine.
>> >
>> > Still not working:
>> [..]
>> > $ sudo machinectl shell root@iofabric
>> > [sudo] password for nikratio: 
>> > Failed to get shell PTY: Cannot set property
>> > StandardInputFileDescriptor, or unknown property.
>> 
>> So this seems to be caused by systemd in the container being too old,
>> so `machinectl shell` is not available here.
>> 
>> The 'nsenter' approach seems to work so far, but I don't see a generally
>> applicable way to figure out the right PID. Is there a trick for
>> that?
>
> machinectl show --value $MACHINE -p Leader

Still not quite working, now there seems to be a problem with
/proc/self/fd in the new shell:

$ sudo systemd-nspawn -M $MACHINE \
 --private-users=1379532800:65536 --private-network \
 --as-pid2

# Other terminal

$ pid=$(machinectl show --value $MACHINE -p Leader 2> /dev/null)
$ sudo nsenter -t ${pid} --mount --uts --ipc --net --pid --cgroup --user
[root@iofabric /]# echo $UID
0
[root@iofabric /]# echo 1 > /dev/stderr 
-bash: /dev/stderr: Permission denied
[root@iofabric /]# ll /dev/stderr
lrwxrwxrwx 1 root root 15 Jun 29 21:13 /dev/stderr -> /proc/self/fd/2
[root@iofabric /]# ll /proc/self/fd/2
lrwx------ 1 root root 64 Jun 29 21:22 /proc/self/fd/2 -> /dev/pts/0
[root@iofabric /]# ll /dev/pts/
total 0
crw-rw-rw- 1 root root 5, 2 Jun 29 21:13 ptmx
[root@iofabric /]# 


What's happening here?


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-29 Thread Ignaz Forster

Reordered the quotes below for better reading flow.

On 28.06.2018 at 10:52, Lennart Poettering wrote:

But quite frankly I don't grok the problem at hand, i.e. what you are
trying to do, even.


Was this explanation any better?


Not really, still. What I don't grok is what precisely a "system snapshot"
in suse terms is actually supposed to entail. Is it supposed to
contain only the vendor RPMs, i.e. only /usr?


That's the general idea, yes.*

Everything that contains variable or user data (i.e. data that is not
supposed to be rolled back, such as databases or files created by the user)
is put on its own subvolume or partition.


For reference, here's what this looks like on openSUSE Leap 15 again:
ID   parent  top level  path
---  ------  ---------  ----
257  5       5          /@
258  257     257        /@/var
259  257     257        /@/usr/local
260  257     257        /@/tmp
261  257     257        /@/srv
262  257     257        /@/root
263  257     257        /@/opt
264  257     257        /@/home
265  257     257        /@/boot/grub2/x86_64-efi
266  257     257        /@/boot/grub2/i386-pc
267  257     257        /@/.snapshots
411  267     267        /@/.snapshots/138/snapshot
412  267     267        /@/.snapshots/139/snapshot


*) Some packages will still use /bin, /lib and the like, and those will 
be part of the snapshot; on the other hand distribution RPMs may also 
contain files or directories in e.g. /var, which will not be part of the 
snapshot. Because of that I'd prefer the term "static / read-only / 
unmodifiable part of the root file system" instead of "vendor RPMs".



or everything except
/home, /srv, /var, /tmp?


Everything except the directories listed above, because those contain 
variable data which one usually doesn't want to reset just because e.g. 
a new kernel doesn't boot.
That won't prevent the user from creating his own snapshots of these 
subvolumes of course.



systemd will never create disassociated subvolumes for you.


That's the problem - it will create subvolumes which will just disappear
from the system when switching to the next snapshot.


Well, no, if snapshots are done recursively they wouldn't, they would
be switched at the same time.


I don't think it's relevant for this discussion, but you have now
repeatedly mentioned recursive snapshots; as far as I'm aware, btrfs
is not capable of doing that. I found a patchset at
https://www.spinics.net/lists/linux-btrfs/msg29205.html, but it seems
the relevant parts for snapshot creation were never added upstream.


So how are those recursive btrfs snapshots supposed to work?


tmpfiles won't create any subvolumes for you — except if they are
missing. tmpfiles can't guess the complex mappings you applied to your
tree, it can't know that you don't want to allow recursive snapshots,
but place them all in the same dir and bind mount them. Also, if I
understand correctly the way suse sets this up always *requires*
additions to fstab for any subvol created, which is clearly out of
focus for tmpfiles.


I agree that it's next to impossible to programmatically find out what a
user intended to do with a specific layout.
However, in my opinion it would be preferable to create a configuration
that at least works, even if it is not optimal, rather than one that is
known to break in several cases (independent of the distribution).


Instead of adding fstab entries (which I also have a bellyache with) it 
may be an alternative to create a mount unit instead. But yes, something 
would have to be done to mount those subvolumes on boot.



Also, tmpfiles won't actually create any subvols below /usr (unless a
user dropped something in to do that on its own), it will only do so
in the root dir for precisely /var, /tmp, /home and /srv. All others
are created below /var. Which means your rule of "don't create subvols
below system directories" isn't actually touched, because the
read-only OS is monopolized in /usr anyway... Or maybe I am still not
getting what you are trying to say?
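
For reference, the behaviour described in the paragraph above comes from tmpfiles.d lines of the following shape. This is a sketch only: the `q` and `Q` line types are documented in tmpfiles.d(5), but the exact modes, owners and ages below are assumptions and should be checked against the files shipped in /usr/lib/tmpfiles.d on a given system.

```conf
# Type  Path               Mode  User  Group  Age
# q/Q create a btrfs subvolume if the path is missing and / is itself a
# subvolume; otherwise they fall back to creating a plain directory.
q       /var               0755  -     -      -
q       /tmp               1777  root  root   10d
Q       /home              0755  -     -      -
q       /srv               0755  -     -      -
Q       /var/lib/machines  0700  -     -      -
```

The last line is the one relevant to this thread: on a layout without a dedicated /var subvolume, /var/lib/machines ends up as a subvolume nested inside whatever subvolume holds the root file system.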


The rule would be "don't create subvols below snapshots", and the 
read-only OS is not exactly monopolized in /usr either (not only because 
of /bin, /lib etc, but also because of /boot - see last paragraph of the 
mail), but apart from that, that nails it.


The issue was originally discovered when upgrading systemd on an older 
openSUSE machine which did not have a unified /var subvolume, so 
/var/lib/machines got attached to the root subvolume.
This may happen again in the future for us, but as said we are not the 
only ones using this mechanism. Seeing the default Fedora and Ubuntu 
btrfs layouts it's even more likely to happen if anybody is using 
pattern 3 there. Apart from that I'd prefer systemd-tmpfiles to work 
even if a user threw in something unexpected.


I'm wondering if just refusing to create a subvolume on a snapshot would 
be another option... That way the problem would be given back to the 
user or distribution.



The assumption systemd-tmpfiles makes