Re: [systemd-devel] Health check for a service managed by systemd

2019-07-25 Thread Reindl Harald


On 25.07.19 at 20:38, Debraj Manna wrote:
> I have a service on Ubuntu 16.04 which I control with systemctl start,
> stop, restart and status.
>
> One time systemctl status returned active, but the application
> "behind" the service responded with an HTTP code other than 200.
>
> So I would like to restart the service when the HTTP code is not 200.
> Can someone let me know if there is a way to achieve this via systemd?

nope, just write a separate service with a little curl magic and
"systemctl condrestart", and remember that you have to avoid premature
restarts just because of a little load peak
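
something like this, roughly - all names, the port and the health URL
below are made up, and "condrestart" is the old compat alias for what
is "systemctl try-restart" today:

# /etc/systemd/system/myapp-healthcheck.service - sketch only
[Unit]
Description=HTTP health check for myapp.service

[Service]
Type=oneshot
# restart the main service unless the endpoint answers exactly 200
ExecStart=/bin/sh -c '[ "$(curl -s -o /dev/null -w "%%{http_code}" http://localhost:8080/health)" = "200" ] || systemctl try-restart myapp.service'

# /etc/systemd/system/myapp-healthcheck.timer - sketch only
[Unit]
Description=Periodic HTTP health check for myapp.service

[Timer]
OnBootSec=2min
OnUnitActiveSec=1min

[Install]
WantedBy=timers.target

enable it with "systemctl enable --now myapp-healthcheck.timer"; to
avoid restarting on a single load peak, give curl a generous --max-time
and only restart after a couple of consecutive failures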

[systemd-devel] Health check for a service managed by systemd

2019-07-25 Thread Debraj Manna
I have a service on Ubuntu 16.04 which I control with systemctl start,
stop, restart and status.

One time systemctl status returned active, but the application "behind"
the service responded with an HTTP code other than 200.

So I would like to restart the service when the HTTP code is not 200. Can
someone let me know if there is a way to achieve this via systemd?

Re: [systemd-devel] service fails to use the latest value of the slice

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 13:02, Tiwari, Hari Sahaya (hari-sahaya.tiw...@hpe.com) wrote:

> Hello,
> I have one query on the behaviour I am observing with a systemd service.
>
> Below are the contents of the service file,
>
> # cat /usr/lib/systemd/system/qs.service
> [Unit]
> Description=init script
> After=network.target
>
> [Service]
> ExecStartPre=/bin/sh /usr/local/cmcluster/bin/realtimeslice.sh -s /usr/lib/systemd/system/qs.service
> ExecStart=/usr/local/qs/bin/qs
> Type=simple
>
> [Install]
> WantedBy=multi-user.target
>
> What realtimeslice.sh does is identify a slice whose RT quantum
> (cpu.rt_runtime_us) is 95.  Once the slice is identified, the
> service attaches the binary mentioned in ExecStart to that slice.
> This is done because the binary (/usr/local/qs/bin/qs) tries to set
> realtime priority.

I have trouble understanding this. I think there's some misconception
about what systemd's slice concept is about.

Also note: systemd owns the cgroup tree. If you make changes to the
cgroup tree you are on your own; you are not supposed to, except if
you asked for your own delegated subtree.

This means: attaching a process systemd manages for you to a different
cgroup with your own code is not supported, you are on your own.

That said, you can let systemd create cgroups and then yourself adjust
the attributes that systemd doesn't manage, such as the RT
attributes. For that use an ExecStartPre= that just sets these
attributes, and maybe make sure with CPUWeight=100 that you actually
get properly added to the "cpu" hierarchy...
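
Something like this, as a rough sketch only - the cgroup path assumes a
cgroup-v1 "cpu" hierarchy with the default system.slice layout, and the
950000 value is just an example RT budget:

[Service]
# make sure the unit is realized in the "cpu" hierarchy
# (CPUWeight= on the unified hierarchy, CPUShares= on legacy cgroup v1)
CPUWeight=100
# ExecStartPre= runs inside the service's own cgroup, so the group
# exists by this point; write the RT attribute systemd doesn't manage
ExecStartPre=/bin/sh -c 'echo 950000 > /sys/fs/cgroup/cpu/system.slice/qs.service/cpu.rt_runtime_us'
ExecStart=/usr/local/qs/bin/qs

Note that on cgroup v1 the RT runtime is allocated hierarchically, so
the parent slices may need a budget as well.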

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] device enumeration by systemd user instances

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 17:51, Pawel Szewczyk (p.szewc...@samsung.com) wrote:

> On 7/23/19 18:00, Lennart Poettering wrote:
> >
> > Do you have some profiler results about this? i.e. what exactly is the
> > time spent on?
>
> I will probably try to do some 'real' study of this problem using
> perf or some other tool. So far all I know is that the device enumeration
> (i.e. the device_enumerate() function in src/core/device) takes around 100ms
> on my ARM devices. As I noted before, around 80 devices are
> enumerated.

80 devices sounds like a lot? On my beefy laptop here I have like 50;
how come it's so many on some ARM device? I'd expect like 10 or so...

Normally it should just be a handful of block devices, ttys and
network devices. You got so many of them?

> > Hmm, so some people appear to use this, since we recently fixed a bug
> > in this area that people noticed while making use of this...
>
> Of course, someone can use it, and it seems reasonable to have the
> ability to use device units in a user session. I would argue, however, that
> having the same set of devices processed for both 'worlds' is sub-optimal.
> Why not have a separate tag in udev for 'user' devices, that would be
> enumerated by 'systemd --user'?

Hmm, we recently went the other way, and made the .device handling in
the --user instance more like the one in the system instance.

But you do have a point, it might be worth adding a more restricted
tag there (after all, unpriv userspace should never need to know
anything about block devices, as one example). But also: why is this
so slow and why are so many devices tagged for you? To keep things
simple I'd add a separate tag only as a last resort and rather see the
slow stuff improved...

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] device enumeration by systemd user instances

2019-07-25 Thread Pawel Szewczyk
On 7/23/19 18:00, Lennart Poettering wrote:
> 
> Do you have some profiler results about this? i.e. what exactly is the
> time spent on?

I will probably try to do some 'real' study of this problem using
perf or some other tool. So far all I know is that the device enumeration
(i.e. the device_enumerate() function in src/core/device) takes around 100ms
on my ARM devices. As I noted before, around 80 devices are enumerated.

> 
> [...]
> 
> Hmm, so some people appear to use this, since we recently fixed a bug
> in this area that people noticed while making use of this...

Of course, someone can use it, and it seems reasonable to have the
ability to use device units in a user session. I would argue, however, that
having the same set of devices processed for both 'worlds' is sub-optimal.
Why not have a separate tag in udev for 'user' devices, that would be
enumerated by 'systemd --user'?

> 
>> So here is the question: is this a safe thing to do? I wonder if trying
>> to start units without having device units created would break something
>> in systemd.
> 
> Well, the work needs to be done anyway, so what do you gain if you do
> it a bit later?

The same thing that we already get for starting units: the ability to do 
that in parallel with other things.

> 
> What specifically do you try to optimize? latency until the first
> process is started by systemd --user?

I think the main goal is starting all units in the user session as fast
as possible.


Thanks,
Paweł Szewczyk

Re: [systemd-devel] device enumeration by systemd user instances

2019-07-25 Thread Greg KH
On Thu, Jul 25, 2019 at 05:29:33PM +0200, Pawel Szewczyk wrote:
> On 7/17/19 23:14, Greg KH wrote:
> > 
> > 100ms seems like a really long time, what exactly is it doing during
> > that time?  Is the kernel spending too much time regenerating the
> > uevents?
> > 
> > How many devices are you talking about?  Any chance to see what really
> > is happening here by running perf to see where the hot-spots are?
> 
> There are ~80 devices being enumerated, and 1ms per device still seems a
> little too long. Note that we are running this on an ARM architecture.
>
> Maybe using perf or other tools to find out what exactly is taking this
> time is a good idea (I've never used perf, to be honest). For now, just by
> adding some verbose logs, it seems that processing each device
> actually takes around 1ms.

Run perf and see what is going on, that seems like way too long per
device.
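
Something like this is usually enough for a first picture (a sketch; the
10 second window is arbitrary and should cover the moment the user
instance starts enumerating):

# system-wide capture with call graphs, then an interactive report
perf record -a -g -- sleep 10
perf report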

thanks,

greg k-h

Re: [systemd-devel] device enumeration by systemd user instances

2019-07-25 Thread Pawel Szewczyk
On 7/17/19 23:14, Greg KH wrote:
> 
> 100ms seems like a really long time, what exactly is it doing during
> that time?  Is the kernel spending too much time regenerating the
> uevents?
> 
> How many devices are you talking about?  Any chance to see what really
> is happening here by running perf to see where the hot-spots are?

There are ~80 devices being enumerated, and 1ms per device still seems a
little too long. Note that we are running this on an ARM architecture.

Maybe using perf or other tools to find out what exactly is taking this
time is a good idea (I've never used perf, to be honest). For now, just by
adding some verbose logs, it seems that processing each device
actually takes around 1ms.

Regards,
Paweł Szewczyk

Re: [systemd-devel] Logs from a service is not showing up in journalctl but showing up in syslog

2019-07-25 Thread Reindl Harald


On 25.07.19 at 15:46, Debraj Manna wrote:
> Thanks Mantas for replying. 
> 
> ExecStartPre=-/bin/su ubuntu -c
> "/home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh"
> ExecStart=/bin/su ubuntu -c
> "/home/ubuntu/build-target/kafka/kafka-systemd-health.sh"
> ExecStopPost=-/bin/bash
> /home/ubuntu/build-target/kafka/kafka-systemd-poststop.sh
> 
> If I specify User= then all the scripts will be executed as that user.
> Can you let me know: if I only need to execute kafka-systemd-prestart.sh
> and kafka-systemd-health.sh as, let's say, the ubuntu user
> and kafka-systemd-poststop.sh as root, what is the recommended way
> to do this?

to split that mess into different services with proper dependencies

[systemd-devel] service fails to use the latest value of the slice

2019-07-25 Thread Tiwari, Hari Sahaya
Hello,
I have one query on the behaviour I am observing with a systemd service.

Below are the contents of the service file,

# cat /usr/lib/systemd/system/qs.service
[Unit]
Description=init script
After=network.target

[Service]
ExecStartPre=/bin/sh /usr/local/cmcluster/bin/realtimeslice.sh -s /usr/lib/systemd/system/qs.service
ExecStart=/usr/local/qs/bin/qs
Type=simple

[Install]
WantedBy=multi-user.target

What realtimeslice.sh does is identify a slice whose RT quantum
(cpu.rt_runtime_us) is 95.
Once the slice is identified, the service attaches the binary mentioned in
ExecStart to that slice.
This is done because the binary (/usr/local/qs/bin/qs) tries to set realtime
priority.

Now, coming to the issue I am facing:


1. Suppose I have 2 slices, A.slice (cpu.rt_runtime_us value = 95) and 
B.slice (cpu.rt_runtime_us  value = 0)

2. Currently the service qs.service is attached to A.slice

3. Now I changed cpu.rt_runtime_us values for these slices:  A.slice to 0 
and B.slice to 95.

4. When I restart the service, ExecStartPre (i.e. realtimeslice.sh)
determines that B.slice is the eligible slice and updates the slice name in the
configuration files and in drop-ins.

5. But the binary at ExecStart still doesn't see the updated slice info and
fails to set the realtime priority.

6. If I restart the service again, it works fine and the binary is able to
set the realtime priority.

Is it that systemd at its invocation caches all the information (including
slice values) and uses those values for ExecStart[Pre]? Because in the next
run it works fine.

Is there a way to reload/refresh the settings done at ExecStartPre, so that
it is reflected at ExecStart?

Thanks & Regards,
Hari.



Re: [systemd-devel] Logs from a service is not showing up in journalctl but showing up in syslog

2019-07-25 Thread Debraj Manna
Thanks Mantas for replying.

ExecStartPre=-/bin/su ubuntu -c "/home/ubuntu/build-target/
kafka/kafka-systemd-prestart.sh"
ExecStart=/bin/su ubuntu -c "/home/ubuntu/build-target/
kafka/kafka-systemd-health.sh"
ExecStopPost=-/bin/bash /home/ubuntu/build-target/
kafka/kafka-systemd-poststop.sh

If I specify User= then all the scripts will be executed as that user.
Can you let me know: if I only need to execute
kafka-systemd-prestart.sh and kafka-systemd-health.sh
as, let's say, the ubuntu user and kafka-systemd-poststop.sh as root,
what is the recommended way to do this?

On Thu, Jul 25, 2019 at 4:50 PM Mantas Mikulėnas  wrote:

> On Thu, Jul 25, 2019 at 1:26 PM Debraj Manna 
> wrote:
>
>> I have a unit file which looks like below. I am seeing some of the echoes
>> showing up in syslog but not in journalctl. Can someone let me know what is
>> going on?
>> systemd version 229 running on Ubuntu 16.
>>
>> [Unit]
>> Description=Kafka Service
>>
>> [Service]
>> Type=simple
>> Environment=KAFKA_HOME=/home/ubuntu/deploy/kafka
>> Environment=LIB_DIR=/var/lib/kafka
>> Environment=LOG_DIR=/var/log/kafka
>> Environment=TEMP_DIR=/home/ubuntu/tmp
>>
>> Environment=TOOLS_JAR=/home/ubuntu/build-target/common-utils/tools-0.001-SNAPSHOT.jar
>> Environment=MIN_DATA_PARTITION_FREE_SPACE_PCT=10
>> Environment=MIN_DATA_PARTITION_FREE_SPACE_GB=10
>> Environment=DATA_PARTITION_NAME=/var
>>
>> ExecStartPre=-/bin/mkdir -p /var/log/kafka
>> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/log/kafka
>> ExecStartPre=-/bin/mkdir -p /var/lib/kafka/kafka-logs
>> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/lib/kafka/kafka-logs
>> ExecStartPre=-/bin/rm -f /var/log/kafka/kafka-logs/.lock
>> ExecStartPre=-/bin/mkdir -p /home/ubuntu/tmp
>> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /home/ubuntu/tmp
>> ExecStartPre=-/bin/chmod -R 775 /home/ubuntu/tmp
>> ExecStartPre=-/bin/su ubuntu -c
>> "/home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh"
>> ExecStart=/bin/su ubuntu -c
>> "/home/ubuntu/build-target/kafka/kafka-systemd-health.sh"
>>
>> [...]
>
>> Doing sudo journalctl -u kafka.service looks like below
>>
>> Jul 25 07:41:39 platform2 systemd[1]: Started Kafka Service.
>> Jul 25 07:41:39 platform2 su[39160]: Successful su for ubuntu by root
>> Jul 25 07:41:39 platform2 su[39160]: + ??? root:ubuntu
>> Jul 25 07:41:39 platform2 su[39160]: pam_unix(su:session): session opened 
>> for user ubuntu by (uid=0)
>> Jul 25 07:41:40 platform2 bash[39192]: [Jul 25 2019 07:41:40-572] Exiting 
>> kafka...
>>
>> I am not seeing some of the echoes from kafka-systemd-prestart.sh in journalctl
>> but I am seeing those logs in syslog
>>
>> Jul 25 10:17:03 platform2 su[38464]: WatchedEvent state:SyncConnected 
>> type:None path:null
>> Jul 25 10:17:03 platform2 su[38464]: Node does not exist: /brokers/ids/2
>> Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-343] partition 
>> /var free% 9 required% 10 freegb 134 required 10
>> Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-344] Sufficient 
>> disk space is not available, sleeping for 60 seconds before exiting...
>>
>>
> Take a look at `journalctl -o verbose SYSLOG_IDENTIFIER=su _PID=38464`.
>
> I suspect the messages *are* in the journal, just not tagged with
> UNIT=kafka.service anymore. In some distros, `su` is actually configured to
> call pam_systemd and set up a new systemd-logind session – when this
> happens, the process is moved out of kafka.service into a user session
> scope, and its syslog messages are grouped accordingly.
>
> Consider replacing `su` with `runuser`, or indeed with systemd's [Service]
> User= option.
>
> --
> Mantas Mikulėnas
>

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Reindl Harald


On 25.07.19 at 13:29, Frank Steiner wrote:
> Silvio Knizek wrote:
> 
>> the proper approach would be to define the dependency of the generated
>> .service to the mount point with a drop-in and RequiresMountsFor=. See
>> man:systemd.unit for more information.
> 
> How would that work for init.d scripts?

exactly the same way I guess - remember that your sysvinit script is in
reality a generated service, as you can see in "systemctl status
sysv-servicename"

/etc/systemd/system/servicename.service.d/ dropins

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Lennart Poettering wrote:


Things like ssh when run as a sysv script: people expect that shutting
down ssh doesn't kill all your children, and it didn't in sysv... And
there are similar things.


That's a reasonable example, thanks!


--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Reindl Harald


On 25.07.19 at 13:15, Mantas Mikulėnas wrote:
> IIRC the idea is that user sessions are killed during shutdown before
> unmounting anything anyway, and services have implicit
> After=local-fs.target so they get stopped before local filesystems? The
> shutdown process is one area that I still know very little about.

the shutdown process is pretty simple: exactly the opposite order, based
on dependencies, of the startup; units without inter-dependencies are
stopped in parallel, just as they are started in parallel at boot


Re: [systemd-devel] Antw: failing unmounts during reboot

2019-07-25 Thread Reindl Harald


On 25.07.19 at 13:07, Frank Steiner wrote:
> Reindl Harald wrote:
> 
>> "try to kill all processes using a filesystem before unmounting it"
>> isn't that easy when it comes to namespaces, "lsof" even don't tell you
>> the root cause preventing unmount but the ernel still refuses to do so
> 
> Agreed! I've seen that already when trying to unmount manually. But it
> does help in many cases, so lsof+kill could be helpful, even if it's not
> a perfect solution. Or at least having the possibility to do sth. like
> that on my own via a drop-in.

and how do you marry that to the service and cgroups concept and
ordering? You can't, because lsof+kill is a blind butcher like the OOM
killer. When your stuff runs in properly ordered units such issues won't
happen, because unlike sysvinit, systemd doesn't leave random leftover
processes running when services and cgroups are terminated

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 13:46, Frank Steiner (fsteiner-ma...@bio.ifi.lmu.de) wrote:

> Mantas Mikulėnas wrote:
>
> > I think that's a deliberate decision made in the
> > systemd-sysv-generator. Note how the generated .service files have
> > "KillMode=process" (and even "RemainAfterExit=yes"). The default for
> > native services is to kill the entire cgroup, and IIRC that even was
> > one of the main reasons for using cgroups.
> > Most likely it's there to retain compatibility with some of the
> > weirder init.d scripts – those which don't start any daemons; those
> > which start several; and so on and so on.
>
> Thanks a lot! I found the service file in /run/systemd/generator.late.
> Anyway, I cannot think of any init.d script where subprocesses or
> anything else should be left running when the script is stopped, so
> KillMode=process seems strange. Maybe Lennart knows why this decision
> was taken.

Things like ssh when run as a sysv script: people expect that shutting
down ssh doesn't kill all your children, and it didn't in sysv... And
there are similar things.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Antw: failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Ulrich Windl wrote:


*1: I have a support call open with SUSE:
Before systemd (almost) all processes were killed before unmounting.
With systemd I'm seeing excessive reboot delays due to unmount timing out. For 
example if you have a process started from NFS that has a log file on NFS open, 
too.
It seems the order is roughly like this:
1) Shutdown the network
2) Try unmounting filesystems, including NFS
3) Kill remaining processes


I cannot confirm that, at least not for SLES/D 15. All mount units
for NFS filesystems created from fstab get "Before=remote-fs.target",
so they are shut down before the network goes down. Check in
/run/systemd/generator to see if this entry is missing in your units.


--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Lennart Poettering wrote:


When generating native unit files from legacy sysv scripts, we use
KillMode=process, which means we'll only kill the main process, and
nothing else. This choice was made since its behaviour comes closest
to classic SysV behaviour (since there the init system didn't kill any
auxiliary processes either).

Given it's 2019 it might be wise to just write a native unit file if
you want better control of this. Note that for native unit files we
use different defaults: there we kill everything by default.

You can also reuse the generated unit, but only change the KillMode=
setting, by creating a drop-in in
/etc/systemd/system/<unit>.service.d/<dropin>.conf, and then
adding there:

 [Service]
 KillMode=control-group


Yes, Mantas' answer brought me to the same solution. For me that's
everything I need to solve the problems with our init.d scripts, I guess,
so thanks!


But let me underline: a SysV script which leaves processes around is
simply buggy. In sysv it was expected that scripts would clean up
properly on "stop", on their own. If you don't do that in your script,
then maybe fix that...

The only correct way to shut things down is with individual ordering
between units really, but that falls apart if you stick to historic
sysv semantics too much.


Totally agreed, but the problem is scripts like the ones for grid or
matlab that have 200 lines or do such weird things that you couldn't
do anything but write a unit more or less like the one systemd-sysv-generator
creates. I just have to deal with those, but the KillMode drop-in
helps enough.

Thanks for all the explanations!

cu,
Frank
--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Mantas Mikulėnas wrote:


I think that's a deliberate decision made in the
systemd-sysv-generator. Note how the generated .service files have
"KillMode=process" (and even "RemainAfterExit=yes"). The default for
native services is to kill the entire cgroup, and IIRC that even was
one of the main reasons for using cgroups.
Most likely it's there to retain compatibility with some of the
weirder init.d scripts – those which don't start any daemons; those
which start several; and so on and so on.


Thanks a lot! I found the service file in /run/systemd/generator.late.
Anyway, I cannot think of any init.d script where subprocesses or
anything else should be left running when the script is stopped, so
KillMode=process seems strange. Maybe Lennart knows why this decision
was taken.

But now that I found the service file, I created a drop-in
/etc/systemd/system/bla.service.d/kill.conf with

[Service]
KillMode=control-group

and now, after reloading everything, the sleep is killed.
So for all init.d scripts that I get from somewhere, I now have a
way to ensure that all their processes are cleaned up by a
simple drop-in.

This will likely solve most of my problems.

Thanks for pointing me in the right direction :-)

cu,
Frank

--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 13:29, Frank Steiner (fsteiner-ma...@bio.ifi.lmu.de) wrote:

> Silvio Knizek wrote:
>
> > the proper approach would be to define the dependency of the generated
> > .service to the mount point with a drop-in and RequiresMountsFor=. See
> > man:systemd.unit for more information.
>
> How would that work for init.d scripts?
>
> > Also the systemd-sysv-generator can only do so much.
>
> 1) If the systemd-sysv-generator creates wrapper .service files on-the-fly,
>    is it possible to hook into those with dropins? The man page didn't
>    give much information.

Yes.

> 2) If init.d scripts are wrapped by .service files, shouldn't the
>    processes spawned by these scripts be killed when shutting down?

No, for compat with sysv that's not what we do. "/etc/init.d/foobar
stop" didn't do any such clean-up, and hence neither do we.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Antw: Re: failing unmounts during reboot

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 12:49, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> >>> Silvio Knizek wrote on 25.07.2019 at 12:10:
>
> [...]
> > Hi,
> >
> > the proper approach would be to define the dependency of the generated
> > .service to the mount point with a drop-in and RequiresMountsFor=. See
> > man:systemd.unit for more information.
> > Also the systemd-sysv-generator can only do so much. Please write
> > yourself proper units if you encounter problems.
>
> What this "solution" fails to see is that any user can start a
> process that may prevent clean unmount. It's completely far away
> from reality to believe that such a user will write (or even know
> how to write) a systemd service!

We automatically kill all unprivileged user programs on shutdown.

We do not kill processes from system service units that have
KillMode=process and thus explicitly requested to be exempted from
the "kill all my processes" logic; they are expected to clean up after
themselves on their own. SysV scripts set that for compat reasons. I
cannot fix sysv for you, sorry; it works like it works, and does not
clean up properly.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Antw: Re: Antw: failing unmounts during reboot

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 12:52, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> > "try to kill all processes using a filesystem before unmounting it"
> > isn't that easy when it comes to namespaces, "lsof" even don't tell you
> > the root cause preventing unmount but the ernel still refuses to do so
>
> Does systemd even try to use lsof?

No, of course not. We tend to avoid hacks like that.

We generally expect packages to come with proper ordering in place,
and if they still insist on SysV init scripts, to work properly and
clean up after themselves.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Antw: failing unmounts during reboot

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 12:08, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> *1: I have a support call open with SUSE:
> Before systemd (almost) all processes were killed before unmounting.
> With systemd I'm seeing excessive reboot delays due to unmount timing out. 
> For example if you have a process started from NFS that has a log file on NFS 
> open, too.
> It seems the order is roughly like this:
> 1) Shutdown the network
> 2) Try unmounting filesystems, including NFS
> 3) Kill remaining processes

That's a bug in Suse. We provide all the right hooks so that any kind
of networking solution can be plugged in at the right places. Since we
don't ship those we can't make sure they are properly ordered, though;
it's an integration issue your distro has to take care of.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Lennart Poettering
On Thu, 25.07.19 11:40, Frank Steiner (fsteiner-ma...@bio.ifi.lmu.de) wrote:

> Hi,
>
> I'm currently discussing a problem with the SuSE support about failing
> unmounts during reboot. Trying to debug this I realized that systemd
> is not killing processes left over by some init.d script. E.g. use
> the following script in /etc/init.d/
>
> #!/bin/sh
> #
> ### BEGIN INIT INFO
> # Provides: bla
> # Required-Start: $network $remote_fs sshd
> # Required-Stop: $network $remote_fs sshd
> # Default-Start:  2 3 5
> # Description:test
> ### END INIT INFO
> case "$1" in
>  start) cd /test; /usr/bin/sleep 99d & ;;
>   stop) true;;
> esac

When generating native unit files from legacy sysv scripts, we use
KillMode=process, which means we'll only kill the main process, and
nothing else. This choice was made since its behaviour comes closest
to classic SysV behaviour (since there the init system didn't kill any
auxiliary processes either).

Given it's 2019 it might be wise to just write a native unit file if
you want better control of this. Note that for native unit files we
use different defaults: there we kill everything by default.

You can also reuse the generated unit, but only change the KillMode=
setting, by creating a drop-in in
/etc/systemd/system/<unit>.service.d/<dropin>.conf, and then
adding there:

[Service]
KillMode=control-group
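
(After creating the drop-in, run "systemctl daemon-reload" so that the
changed KillMode= is picked up before the next stop.)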

But let me underline: a SysV script which leaves processes around is
simply buggy. In sysv it was expected that scripts would clean up
properly on "stop", on their own. If you don't do that in your script,
then maybe fix that...

> On shutdown, unmounting /test will fail because the sleep process is
> not killed. Shouldn't there be a mechanism in systemd to kill processes
> spawned by LSB scripts when shutting these down?

Well, quite frankly, either way we do it people will be upset. If we
kill all processes of a service on stop, people tell us "sysv didn't
kill all processes on service stop, why do you?". Now you say the
opposite: "why don't you kill all service processes on stop, you
should!", but there's no way out.

If you ask me, just forget about SysV init scripts in 2019, and spend
the 15 minutes to just put together a native unit. It will save you
frustration in the long run, and fixes all these issues.

Also note that we live in a world where various kinds of storage
(including much of NFS) require local services to be running to
operate. Because of that we can't just decide "oh, it's time to tear
down NFS, let's kill *every* process by PID 1", because then the whole
system will be borked.

The only correct way to shut things down is with individual ordering
between units really, but that falls apart if you stick to historic
sysv semantics too much.

>
> And moreover, wouldn't it make sense to have a mechanism to at least
> try to kill all processes using a filesystem before unmounting it?

There's no sensible API for that. Moreover, this should be entirely
unnecessary with correctly behaving services. It's just that you wrote
a broken one...

> We often see failing unmounts of several local or iscsi fs during
> reboot, and in the support case we are currently working on with SuSE
> failing iscsi fs even cause xfs I/O errors. So it might be a good idea
> to have sth. like a lsof + kill before unmounting a filesystem, maybe
> configurable with a flag to enable or disable it. Even if lsof or kill
> failed, it wouldn't be worse than now.

lsof is a) slow (it searches all of /proc), b) racy (because it won't
properly grok fds coming and going), and c) incomplete (we live in a
world of pidns these days). This is a hack on top of a hack really,
let's not do that.

> As far as I see there is no way to write a drop-in for a mount unit
> that allows executing commands before the unmount happens, is that
> right? Sth. like "ExecPreUmount=" would help here, especially if there
> was sth. like a umount@.service that would be called for every umount
> with e.g. the mountpoint accessible via a variable.

We didn't add that on purpose, since we wanted to make sure that what
systemd does is mostly reproducible with a plain "mount" command on
the shell...

You can manually do this though, but it's a hack really: just write a
service, order it After= the specific mount, and
Before=local-fs.target. But it's going to be super racy, and a poor
hack against missing ordering deps.
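
For the record, a sketch of that hack - the unit and mount names are
assumed, and fuser -km is exactly the racy lsof+kill butchery discussed
elsewhere in this thread:

[Unit]
Description=kill leftover users of /test before it is unmounted
# stopped before test.mount is unmounted, since stop order is reversed
After=test.mount
Before=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
# runs at stop time, while /test is still mounted
ExecStop=/usr/bin/fuser -km /test

[Install]
WantedBy=multi-user.target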

Long story short: fix your deps, write proper units.

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Silvio Knizek wrote:


the proper approach would be to define the dependency of the generated
.service to the mount point with a drop-in and RequiresMountsFor=. See
man:systemd.unit for more information.


How would that work for init.d scripts?


Also the systemd-sysv-generator can only do so much.


1) If the systemd-sysv-generator creates wrapper .service files on-the-fly,
   is it possible to hook into those with dropins? The man page didn't
   give much information.

2) If init.d scripts are wrapped by .service files, shouldn't the
   processes spawned by these scripts be killed when shutting down?

   For my example init.d script I see this with "systemctl status bla":
  CGroup: /system.slice/bla.service
   └─9997 /usr/bin/sleep 99d

   But after "systemctl stop bla", the sleep process is still there.

   When I write a .service unit of type oneshot with RemainAfterExit=yes
   that calls a script that spawns another sleep process, I see the same
   structure:
   CGroup: /system.slice/blu.service
   └─24351 sleep 44d

   But after "systemctl stop blu" this sleep is gone.

   So why doesn't the .service wrapper for the init.d script
   kill the processes it has spawned? Is that a bug?

   That would likely solve most of the problems with init.d scripts
   even for iscsi mounts due to the ordering that can be done with
   $remote_fs.


Please write yourself proper units if you encounter problems.


I'm not writing init.d scripts; I always use units for my own
stuff. I'm talking about init.d scripts that are still delivered with
many software packages and that I wouldn't like (or try) to rewrite.

cu,
Frank

--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *

Re: [systemd-devel] Logs from a service is not showing up in journalctl but showing up in syslog

2019-07-25 Thread Mantas Mikulėnas
On Thu, Jul 25, 2019 at 1:26 PM Debraj Manna 
wrote:

> I have a unit file which looks like below. I am seeing some of the echoes
> showing up in syslog but not in journalctl. Can someone let me know what is
> going on?
> systemd version 229 running on Ubuntu 16.
>
> [Unit]
> Description=Kafka Service
>
> [Service]
> Type=simple
> Environment=KAFKA_HOME=/home/ubuntu/deploy/kafka
> Environment=LIB_DIR=/var/lib/kafka
> Environment=LOG_DIR=/var/log/kafka
> Environment=TEMP_DIR=/home/ubuntu/tmp
>
> Environment=TOOLS_JAR=/home/ubuntu/build-target/common-utils/tools-0.001-SNAPSHOT.jar
> Environment=MIN_DATA_PARTITION_FREE_SPACE_PCT=10
> Environment=MIN_DATA_PARTITION_FREE_SPACE_GB=10
> Environment=DATA_PARTITION_NAME=/var
>
> ExecStartPre=-/bin/mkdir -p /var/log/kafka
> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/log/kafka
> ExecStartPre=-/bin/mkdir -p /var/lib/kafka/kafka-logs
> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/lib/kafka/kafka-logs
> ExecStartPre=-/bin/rm -f /var/log/kafka/kafka-logs/.lock
> ExecStartPre=-/bin/mkdir -p /home/ubuntu/tmp
> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /home/ubuntu/tmp
> ExecStartPre=-/bin/chmod -R 775 /home/ubuntu/tmp
> ExecStartPre=-/bin/su ubuntu -c
> "/home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh"
> ExecStart=/bin/su ubuntu -c
> "/home/ubuntu/build-target/kafka/kafka-systemd-health.sh"
>
> [...]

> Doing sudo journalctl -u kafka.service looks like below
>
> Jul 25 07:41:39 platform2 systemd[1]: Started Kafka Service.
> Jul 25 07:41:39 platform2 su[39160]: Successful su for ubuntu by root
> Jul 25 07:41:39 platform2 su[39160]: + ??? root:ubuntu
> Jul 25 07:41:39 platform2 su[39160]: pam_unix(su:session): session opened for 
> user ubuntu by (uid=0)
> Jul 25 07:41:40 platform2 bash[39192]: [Jul 25 2019 07:41:40-572] Exiting 
> kafka...
>
> I am not seeing some of the echoes from kafka-systemd-prestart.sh in journalctl
> but I am seeing those logs in syslog
>
> Jul 25 10:17:03 platform2 su[38464]: WatchedEvent state:SyncConnected 
> type:None path:null
> Jul 25 10:17:03 platform2 su[38464]: Node does not exist: /brokers/ids/2
> Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-343] partition 
> /var free% 9 required% 10 freegb 134 required 10
> Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-344] Sufficient 
> disk space is not available, sleeping for 60 seconds before exiting...
>
>
Take a look at `journalctl -o verbose SYSLOG_IDENTIFIER=su _PID=38464`.

I suspect the messages *are* in the journal, just not tagged with
UNIT=kafka.service anymore. In some distros, `su` is actually configured to
call pam_systemd and set up a new systemd-logind session – when this
happens, the process is moved out of kafka.service into a user session
scope, and its syslog messages are grouped accordingly.

Consider replacing `su` with `runuser`, or indeed with systemd's [Service]
User= option.
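
A sketch of what that could look like for the unit above (runuser lives
in /sbin on Ubuntu 16.04; unlike su, it does not open a logind session
in typical configurations, so the processes stay inside kafka.service):

ExecStartPre=-/sbin/runuser -u ubuntu -- /home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh
ExecStart=/sbin/runuser -u ubuntu -- /home/ubuntu/build-target/kafka/kafka-systemd-health.sh
# no User= set, so ExecStopPost= keeps running as root:
ExecStopPost=-/bin/bash /home/ubuntu/build-target/kafka/kafka-systemd-poststop.sh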

-- 
Mantas Mikulėnas

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Mantas Mikulėnas
On Thu, Jul 25, 2019 at 12:49 PM Frank Steiner <
fsteiner-ma...@bio.ifi.lmu.de> wrote:

> Hi,
>
> I'm currently discussing a problem with the SuSE support about failing
> unmounts during reboot. Trying to debug this I realized that systemd
> is not killing processes left over by some init.d script. E.g. use
> the following script in /etc/init.d/
>
> #!/bin/sh
> #
> ### BEGIN INIT INFO
> # Provides: bla
> # Required-Start: $network $remote_fs sshd
> # Required-Stop: $network $remote_fs sshd
> # Default-Start:  2 3 5
> # Description:test
> ### END INIT INFO
> case "$1" in
>   start) cd /test; /usr/bin/sleep 99d & ;;
>stop) true;;
> esac
>
>
> On shutdown, unmounting /test will fail because the sleep process is
> not killed. Shouldn't there be a mechanism in systemd to kill processes
> spawned by LSB scripts when shutting these down?
>

I think that's a deliberate decision made in the systemd-sysv-generator.
Note how the generated .service files have "KillMode=process" (and even
"RemainAfterExit=yes"). The default for native services is to kill the
entire cgroup, and IIRC that even was one of the main reasons for using
cgroups.

Most likely it's there to retain compatibility with some of the weirder
init.d scripts – those which don't start any daemons; those which start
several; and so on and so on.


>
> And moreover, wouldn't it make sense to have a mechanism to at least
> try to kill all processes using a filesystem before unmounting it?
> We often see failing unmounts of several local or iscsi fs during
> reboot, and in the support case we are currently working on with SuSE
> failing iscsi fs even cause xfs I/O errors. So it might be a good idea
> to have sth. like a lsof + kill before unmounting a filesystem, maybe
> configurable with a flag to enable or disable it. Even if lsof or kill
> failed, it wouldn't be worse than now.
>

IIRC the idea is that user sessions are killed during shutdown before
unmounting anything anyway, and services have implicit
After=local-fs.target so they get stopped before local filesystems? The
shutdown process is one area that I still know very little about.


>
> As far as I see there is no way to write a drop-in for a mount unit
> that allows executing commands before the unmount happens, is that
> right? Sth. like "ExecPreUmount=" would help here, especially if there
> was sth. like a umount@.service that would be called for every umount
> with e.g. the mountpoint accessible via a variable.
>

No, there isn't.

-- 
Mantas Mikulėnas

Re: [systemd-devel] Antw: failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Reindl Harald wrote:


"try to kill all processes using a filesystem before unmounting it"
isn't that easy when it comes to namespaces, "lsof" even don't tell you
the root cause preventing unmount but the ernel still refuses to do so


Agreed! I've seen that already when trying to unmount manually. But it
does help in many cases, so lsof+kill could be helpful, even if it's not
a perfect solution. Or at least having the possibility to do sth. like
that on my own via a drop-in.

--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *

[systemd-devel] Antw: Re: Antw: failing unmounts during reboot

2019-07-25 Thread Ulrich Windl
>>> Reindl Harald wrote on 25.07.2019 at 12:16 in
message <1bd5766b-018c-10f9-b2b2-46fc78ddd...@thelounge.net>:

> On 25.07.19 at 12:08, Ulrich Windl wrote:
>> Before systemd (almost) all processes were killed before unmounting
> 
> that's often a transfigured point of view
>
> many issues were there also before systemd but never got noticed, and

Believe me, I would have noticed if NFS /home failed to unmount during reboot!

> after the change to systemd they got visible; just because sysvinit
> doesn't tell you about a problem because of missing error handling
> doesn't mean there is none

They became visible because they did not exist before!

> 
> the same with the emergency shell, which covers nearly all cases where you
> needed to start a livesystem before, and now you have a working
> environment where you can fix the root cause way quicker

I completely disagree.

> 
> "try to kill all processes using a filesystem before unmounting it"
> isn't that easy when it comes to namespaces, "lsof" even don't tell you
> the root cause preventing unmount but the ernel still refuses to do so

Does systemd even try to use lsof?


[systemd-devel] Antw: Re: failing unmounts during reboot

2019-07-25 Thread Ulrich Windl
>>> Silvio Knizek wrote on 25.07.2019 at 12:10:

[...]
> Hi,
> 
> the proper approach would be to define the dependency of the generated
> .service to the mount point with a drop-in and RequiresMountsFor=. See
> man:systemd.unit for more information.
> Also the systemd-sysv-generator can only do so much. Please write
> yourself proper units if you encounter problems.

What this "solution" fails to see is that any user can start a process that may 
prevent clean unmount. It's completely far away from reality to believe that 
such a user will write (or even know how to write) a systemd service!

> 
> BR
> Silvio
> 

Re: [systemd-devel] Logs from a service is not showing up in journalctl but showing up in syslog

2019-07-25 Thread Debraj Manna
Thanks Silvio for replying. I will check your suggestions.

But it appears this is some issue with systemd version 229 as mentioned in
https://unix.stackexchange.com/a/417632

On Thu, Jul 25, 2019 at 4:09 PM Silvio Knizek  wrote:

> On Thursday, 25.07.2019, at 15:55 +0530, Debraj Manna wrote:
> > I have a unit file which looks like below. I am seeing some of the
> > echoes showing up in syslog but not in journalctl. Can someone let
> > me know what is going on?
> > systemd version 229 running on Ubuntu 16.
> >
> > [Unit]
> > Description=Kafka Service
> >
> > [Service]
> > Type=simple
> > Environment=KAFKA_HOME=/home/ubuntu/deploy/kafka
> > Environment=LIB_DIR=/var/lib/kafka
> > Environment=LOG_DIR=/var/log/kafka
> > Environment=TEMP_DIR=/home/ubuntu/tmp
> > Environment=TOOLS_JAR=/home/ubuntu/build-target/common-utils/tools-
> > 0.001-SNAPSHOT.jar
> > Environment=MIN_DATA_PARTITION_FREE_SPACE_PCT=10
> > Environment=MIN_DATA_PARTITION_FREE_SPACE_GB=10
> > Environment=DATA_PARTITION_NAME=/var
> >
> > ExecStartPre=-/bin/mkdir -p /var/log/kafka
> > ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/log/kafka
> > ExecStartPre=-/bin/mkdir -p /var/lib/kafka/kafka-logs
> > ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/lib/kafka/kafka-logs
> > ExecStartPre=-/bin/rm -f /var/log/kafka/kafka-logs/.lock
> > ExecStartPre=-/bin/mkdir -p /home/ubuntu/tmp
> > ExecStartPre=-/bin/chown -R ubuntu:ubuntu /home/ubuntu/tmp
> > ExecStartPre=-/bin/chmod -R 775 /home/ubuntu/tmp
> > ExecStartPre=-/bin/su ubuntu -c
> > "/home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh"
> > ExecStart=/bin/su ubuntu -c
> > "/home/ubuntu/build-target/kafka/kafka-systemd-health.sh"
> > ExecStopPost=-/bin/bash
> > /home/ubuntu/build-target/kafka/kafka-systemd-poststop.sh
> > RestartSec=2s
> > Restart=always
> > LimitNOFILE=65535
> > KillSignal=SIGTERM
> > SendSIGKILL=no
> > SuccessExitStatus=1 143
> >
> > [Install]
> > WantedBy=multi-user.target
> >
> > kafka-systemd-prestart.sh looks like below
> >
> > echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] Starting kafka..."
> > timeout --signal=sigkill 600s java -cp "$TOOLS_JAR"
> > com.vnera.tools.kafka.KafkaIndexValidator "$LIB_DIR/kafka-logs"
> > "$KAFKA_HOME/config/server.properties" true
> > broker_id=`sudo grep -F "broker.id"
> > $KAFKA_HOME/config/server.properties | awk -F '=' '{print $2}'`
> > zookeeper_list=`sudo grep -F "zookeeper.connect="
> > $KAFKA_HOME/config/server.properties | awk -F '=' '{print $2}'`
> > echo "attempting removal of broker id $broker_id"
> >
> > used_pct=`df ${DATA_PARTITION_NAME} --output=pcent | grep -v Use |
> > grep -o '[^ ]*[^ %]'`
> > free_pct=$(expr 100 - $used_pct)
> > free_gb=`df -h ${DATA_PARTITION_NAME} --output=avail --block-size G |
> > grep -v Avail | grep -o '[^ ]*[^ G]'`
> > echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] partition
> > ${DATA_PARTITION_NAME} free% $free_pct required%
> > ${MIN_DATA_PARTITION_FREE_SPACE_PCT} freegb ${free_gb} required
> > ${MIN_DATA_PARTITION_FREE_SPACE_GB}"
> >
> > # Some other code
> >
> > kafka-systemd-poststop.sh looks like below
> >
> > ---
> >
> > echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] Exiting kafka..."
> > cmd="ps -ef | grep -v grep | grep kafkaServer"
> >
> > # Some other code
> >
> > echo "completed exiting kafka"
> >
> > Doing sudo journalctl -u kafka.service looks like below
> >
> > Jul 25 07:41:39 platform2 systemd[1]: Started Kafka Service.
> > Jul 25 07:41:39 platform2 su[39160]: Successful su for ubuntu by root
> > Jul 25 07:41:39 platform2 su[39160]: + ??? root:ubuntu
> > Jul 25 07:41:39 platform2 su[39160]: pam_unix(su:session): session
> > opened for user ubuntu by (uid=0)
> > Jul 25 07:41:40 platform2 bash[39192]: [Jul 25 2019 07:41:40-572]
> > Exiting kafka...
> >
> > I am not seeing some of the echoes from kafka-systemd-prestart.sh in
> > journalctl but I am seeing those logs in syslog
> >
> > Jul 25 10:17:03 platform2 su[38464]: WatchedEvent state:SyncConnected
> > type:None path:null
> > Jul 25 10:17:03 platform2 su[38464]: Node does not exist:
> > /brokers/ids/2
> > Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-343]
> > partition /var free% 9 required% 10 freegb 134 required 10
> > Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-344]
> > Sufficient disk space is not available, sleeping for 60 seconds
> > before
> > exiting...
>
> Hi,
>
> first of all, take a look at man:tmpfiles.d to replace the whole
> mkdir/chmod ExecStartPre= stuff.
> Second, don't use »su« in .service. It breaks stuff by creating a new
> cgroup hierarchy because it's run through pam. Use User= instead.
> With both of these changes your shell scripts shouldn't be necessary at all,
> and then everything should land in the journal.
> Please don't re-invent the stuff systemd is already providing.
>
> BR
> Silvio
>

Re: [systemd-devel] Logs from a service is not showing up in journalctl but showing up in syslog

2019-07-25 Thread Silvio Knizek
On Thursday, 25.07.2019, at 15:55 +0530, Debraj Manna wrote:
> I have a unit file which looks like below. I am seeing some of the
> echoes showing up in syslog but not in journalctl. Can someone let me
> know what is going on?
> systemd version 229 running on Ubuntu 16.
>
> [Unit]
> Description=Kafka Service
>
> [Service]
> Type=simple
> Environment=KAFKA_HOME=/home/ubuntu/deploy/kafka
> Environment=LIB_DIR=/var/lib/kafka
> Environment=LOG_DIR=/var/log/kafka
> Environment=TEMP_DIR=/home/ubuntu/tmp
> Environment=TOOLS_JAR=/home/ubuntu/build-target/common-utils/tools-
> 0.001-SNAPSHOT.jar
> Environment=MIN_DATA_PARTITION_FREE_SPACE_PCT=10
> Environment=MIN_DATA_PARTITION_FREE_SPACE_GB=10
> Environment=DATA_PARTITION_NAME=/var
>
> ExecStartPre=-/bin/mkdir -p /var/log/kafka
> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/log/kafka
> ExecStartPre=-/bin/mkdir -p /var/lib/kafka/kafka-logs
> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/lib/kafka/kafka-logs
> ExecStartPre=-/bin/rm -f /var/log/kafka/kafka-logs/.lock
> ExecStartPre=-/bin/mkdir -p /home/ubuntu/tmp
> ExecStartPre=-/bin/chown -R ubuntu:ubuntu /home/ubuntu/tmp
> ExecStartPre=-/bin/chmod -R 775 /home/ubuntu/tmp
> ExecStartPre=-/bin/su ubuntu -c
> "/home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh"
> ExecStart=/bin/su ubuntu -c
> "/home/ubuntu/build-target/kafka/kafka-systemd-health.sh"
> ExecStopPost=-/bin/bash
> /home/ubuntu/build-target/kafka/kafka-systemd-poststop.sh
> RestartSec=2s
> Restart=always
> LimitNOFILE=65535
> KillSignal=SIGTERM
> SendSIGKILL=no
> SuccessExitStatus=1 143
>
> [Install]
> WantedBy=multi-user.target
>
> kafka-systemd-prestart.sh looks like below
>
> echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] Starting kafka..."
> timeout --signal=sigkill 600s java -cp "$TOOLS_JAR"
> com.vnera.tools.kafka.KafkaIndexValidator "$LIB_DIR/kafka-logs"
> "$KAFKA_HOME/config/server.properties" true
> broker_id=`sudo grep -F "broker.id"
> $KAFKA_HOME/config/server.properties | awk -F '=' '{print $2}'`
> zookeeper_list=`sudo grep -F "zookeeper.connect="
> $KAFKA_HOME/config/server.properties | awk -F '=' '{print $2}'`
> echo "attempting removal of broker id $broker_id"
>
> used_pct=`df ${DATA_PARTITION_NAME} --output=pcent | grep -v Use |
> grep -o '[^ ]*[^ %]'`
> free_pct=$(expr 100 - $used_pct)
> free_gb=`df -h ${DATA_PARTITION_NAME} --output=avail --block-size G |
> grep -v Avail | grep -o '[^ ]*[^ G]'`
> echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] partition
> ${DATA_PARTITION_NAME} free% $free_pct required%
> ${MIN_DATA_PARTITION_FREE_SPACE_PCT} freegb ${free_gb} required
> ${MIN_DATA_PARTITION_FREE_SPACE_GB}"
>
> # Some other code
>
> kafka-systemd-poststop.sh looks like below
>
> ---
>
> echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] Exiting kafka..."
> cmd="ps -ef | grep -v grep | grep kafkaServer"
>
> # Some other code
>
> echo "completed exiting kafka"
>
> Doing sudo journalctl -u kafka.service looks like below
>
> Jul 25 07:41:39 platform2 systemd[1]: Started Kafka Service.
> Jul 25 07:41:39 platform2 su[39160]: Successful su for ubuntu by root
> Jul 25 07:41:39 platform2 su[39160]: + ??? root:ubuntu
> Jul 25 07:41:39 platform2 su[39160]: pam_unix(su:session): session
> opened for user ubuntu by (uid=0)
> Jul 25 07:41:40 platform2 bash[39192]: [Jul 25 2019 07:41:40-572]
> Exiting kafka...
>
> I am not seeing some of the echoes from kafka-systemd-prestart.sh in
> journalctl but I am seeing those logs in syslog
>
> Jul 25 10:17:03 platform2 su[38464]: WatchedEvent state:SyncConnected
> type:None path:null
> Jul 25 10:17:03 platform2 su[38464]: Node does not exist:
> /brokers/ids/2
> Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-343]
> partition /var free% 9 required% 10 freegb 134 required 10
> Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-344]
> Sufficient disk space is not available, sleeping for 60 seconds
> before
> exiting...

Hi,

first of all, take a look at man:tmpfiles.d to replace the whole
mkdir/chmod ExecStartPre= stuff.
Second, don't use »su« in .service. It breaks stuff by creating a new
cgroup hierarchy because it's run through pam. Use User= instead.
With both of these changes your shell scripts shouldn't be necessary at all,
and then everything should land in the journal.
Please don't re-invent the stuff systemd is already providing.
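
For example, something like this could replace the mkdir/chown/chmod
lines (a sketch only; the mode bits are guesses, and the stale-lock
removal via the "r" line only happens at boot, not on every restart):

# /etc/tmpfiles.d/kafka.conf
d /var/log/kafka            0755 ubuntu ubuntu -
d /var/lib/kafka/kafka-logs 0755 ubuntu ubuntu -
d /home/ubuntu/tmp          0775 ubuntu ubuntu -
r /var/log/kafka/kafka-logs/.lock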

BR
Silvio


[systemd-devel] Logs from a service is not showing up in journalctl but showing up in syslog

2019-07-25 Thread Debraj Manna
I have a unit file which looks like below. I am seeing some of the echoes
showing up in syslog but not in journalctl. Can someone let me know what is
going on?
systemd version 229 running on Ubuntu 16.

[Unit]
Description=Kafka Service

[Service]
Type=simple
Environment=KAFKA_HOME=/home/ubuntu/deploy/kafka
Environment=LIB_DIR=/var/lib/kafka
Environment=LOG_DIR=/var/log/kafka
Environment=TEMP_DIR=/home/ubuntu/tmp
Environment=TOOLS_JAR=/home/ubuntu/build-target/common-utils/tools-0.001-SNAPSHOT.jar
Environment=MIN_DATA_PARTITION_FREE_SPACE_PCT=10
Environment=MIN_DATA_PARTITION_FREE_SPACE_GB=10
Environment=DATA_PARTITION_NAME=/var

ExecStartPre=-/bin/mkdir -p /var/log/kafka
ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/log/kafka
ExecStartPre=-/bin/mkdir -p /var/lib/kafka/kafka-logs
ExecStartPre=-/bin/chown -R ubuntu:ubuntu /var/lib/kafka/kafka-logs
ExecStartPre=-/bin/rm -f /var/log/kafka/kafka-logs/.lock
ExecStartPre=-/bin/mkdir -p /home/ubuntu/tmp
ExecStartPre=-/bin/chown -R ubuntu:ubuntu /home/ubuntu/tmp
ExecStartPre=-/bin/chmod -R 775 /home/ubuntu/tmp
ExecStartPre=-/bin/su ubuntu -c "/home/ubuntu/build-target/kafka/kafka-systemd-prestart.sh"
ExecStart=/bin/su ubuntu -c "/home/ubuntu/build-target/kafka/kafka-systemd-health.sh"
ExecStopPost=-/bin/bash /home/ubuntu/build-target/kafka/kafka-systemd-poststop.sh
RestartSec=2s
Restart=always
LimitNOFILE=65535
KillSignal=SIGTERM
SendSIGKILL=no
SuccessExitStatus=1 143

[Install]
WantedBy=multi-user.target

kafka-systemd-prestart.sh looks like below

echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] Starting kafka..."
timeout --signal=sigkill 600s java -cp "$TOOLS_JAR" com.vnera.tools.kafka.KafkaIndexValidator "$LIB_DIR/kafka-logs" "$KAFKA_HOME/config/server.properties" true
broker_id=`sudo grep -F "broker.id" $KAFKA_HOME/config/server.properties | awk -F '=' '{print $2}'`
zookeeper_list=`sudo grep -F "zookeeper.connect=" $KAFKA_HOME/config/server.properties | awk -F '=' '{print $2}'`
echo "attempting removal of broker id $broker_id"

used_pct=`df ${DATA_PARTITION_NAME} --output=pcent | grep -v Use | grep -o '[^ ]*[^ %]'`
free_pct=$(expr 100 - $used_pct)
free_gb=`df -h ${DATA_PARTITION_NAME} --output=avail --block-size G | grep -v Avail | grep -o '[^ ]*[^ G]'`
echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] partition ${DATA_PARTITION_NAME} free% $free_pct required% ${MIN_DATA_PARTITION_FREE_SPACE_PCT} freegb ${free_gb} required ${MIN_DATA_PARTITION_FREE_SPACE_GB}"

# Some other code

kafka-systemd-poststop.sh looks like below

---

echo "[`date +"%h %d %Y %H:%M:%S-%3N"`] Exiting kafka..."
cmd="ps -ef | grep -v grep | grep kafkaServer"

# Some other code

echo "completed exiting kafka"

Doing sudo journalctl -u kafka.service looks like below

Jul 25 07:41:39 platform2 systemd[1]: Started Kafka Service.
Jul 25 07:41:39 platform2 su[39160]: Successful su for ubuntu by root
Jul 25 07:41:39 platform2 su[39160]: + ??? root:ubuntu
Jul 25 07:41:39 platform2 su[39160]: pam_unix(su:session): session opened for user ubuntu by (uid=0)
Jul 25 07:41:40 platform2 bash[39192]: [Jul 25 2019 07:41:40-572] Exiting kafka...

I am not seeing some of the echo output from kafka-systemd-prestart.sh
in journalctl, but I am seeing those logs in syslog:

Jul 25 10:17:03 platform2 su[38464]: WatchedEvent state:SyncConnected type:None path:null
Jul 25 10:17:03 platform2 su[38464]: Node does not exist: /brokers/ids/2
Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-343] partition /var free% 9 required% 10 freegb 134 required 10
Jul 25 10:17:03 platform2 su[38464]: [Jul 25 2019 10:17:03-344] Sufficient disk space is not available, sleeping for 60 seconds before exiting...
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Reindl Harald


On 25.07.19 at 12:10, Silvio Knizek wrote:
> the proper approach would be to define the dependency of the generated
> .service to the mount point with a drop-in and RequiresMountsFor=. See
> man:systemd.unit for more information.
> Also the systemd-sysv-generator can only do so much. Please write
> yourself proper units if you encounter problems

FWIW: and do it with drop-ins in /etc/systemd/system/servicename.service.d/

something that was completely impossible with init scripts; even where
they had some place to customize things, there was no sane concept of
ordering
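
For example (assuming the generated unit is called bla.service):

  systemctl edit bla.service

This creates /etc/systemd/system/bla.service.d/override.conf and opens
it in an editor; whatever you put there overrides or extends the
generated unit without touching the generator output.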
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] Antw: failing unmounts during reboot

2019-07-25 Thread Reindl Harald

On 25.07.19 at 12:08, Ulrich Windl wrote:
> Before systemd (almost) all processes were killed before unmounting

that's often a rose-tinted point of view

many issues were already there before systemd but never got noticed;
after the change to systemd they became visible. Just because sysvinit
didn't tell you about a problem, for lack of error handling, doesn't
mean there was none.

The same goes for the emergency shell, which covers nearly all cases
where you previously needed to boot a live system; now you have a
working environment where you can fix the root cause much more quickly.

"try to kill all processes using a filesystem before unmounting it"
isn't that easy when it comes to namespaces; "lsof" often doesn't even
show you the process preventing the unmount, but the kernel still
refuses to do it
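
As a rough illustration only (and it still misses holders in other
mount namespaces), the manual version of that dance looks like:

  fuser -vm /test   # list processes with files open on the mount
  fuser -km /test   # SIGKILL everything still using /test
  umount /test

even after that, bind mounts or processes in private mount namespaces
can keep the kernel returning EBUSY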
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] failing unmounts during reboot

2019-07-25 Thread Silvio Knizek
On Thursday, 25.07.2019, 11:40 +0200, Frank Steiner wrote:
> Hi,
>
> I'm currently discussing a problem with the SuSE support about
> failing unmounts during reboot. Trying to debug this I realized that
> systemd is not killing processes left over by some init.d script.
> [...]
> As far as I see there is no way to write a drop-in for a mount unit
> that allows to execute commands before the unmount happens, is that
> right? Sth. like "ExecPreUmount=" would help here, especially if
> there was sth. like a umount@.service that would be called for every
> umount with e.g. the mount point accessible with a variable.
>
Hi,

the proper approach would be to define the dependency of the generated
.service on the mount point with a drop-in and RequiresMountsFor=. See
man:systemd.unit for more information.
Also, the systemd-sysv-generator can only do so much. Please write
yourself proper units if you encounter problems.
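
For the script from the original mail that could look like this
(untested sketch; assuming the generator names the unit bla.service
after the "Provides: bla" line):

# /etc/systemd/system/bla.service.d/mount.conf
[Unit]
RequiresMountsFor=/test

This pulls the mount unit for /test in as a dependency and orders the
service after it, so on shutdown the service should be stopped, and
its leftover processes killed with the default KillMode, before
systemd tries to unmount /test.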

BR
Silvio

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] Antw: failing unmounts during reboot

2019-07-25 Thread Ulrich Windl
>>> Frank Steiner wrote on 25.07.2019 at 11:40 in message
>>> <3d410499-649b-f012-afd8-fcfbfef7a...@bio.ifi.lmu.de>:
> Hi,
> 
> I'm currently discussing a problem with the SuSE support about failing
> unmounts during reboot. Trying to debug this I realized that systemd
> is not killing processes left over by some init.d script. E.g. use
> the following script in /etc/init.d/
> 
> #!/bin/sh
> #
> ### BEGIN INIT INFO
> # Provides: bla
> # Required-Start: $network $remote_fs sshd
> # Required-Stop: $network $remote_fs sshd
> # Default-Start:  2 3 5
> # Description: test
> ### END INIT INFO
> case "$1" in
>   start) cd /test; /usr/bin/sleep 99d & ;;
>stop) true;;
> esac
> 
> 
> On shutdown, unmounting /test will fail because the sleep process is
> not killed. Shouldn't there be a mechanism in systemd to kill processes
> spawned by an LSB script when shutting these down?

*1: I have a support call open with SUSE:
Before systemd, (almost) all processes were killed before unmounting.
With systemd I'm seeing excessive reboot delays due to unmounts timing
out, for example if you have a process started from NFS that also has a
log file open on NFS.
It seems the order is roughly like this:
1) Shut down the network
2) Try unmounting filesystems, including NFS
3) Kill remaining processes

> 
> And moreover, wouldn't it make sense to have a mechanism to at least
> try to kill all processes using a filesystem before unmounting it?

+2!

> We often see failing unmounts of several local or iscsi fs during
> reboot, and in the support case we are currently working on with SuSE
> failing iscsi fs even cause xfs I/O errors. So it might be a good idea
> to have sth. like a lsof + kill before unmounting a filesystem, maybe
> configurable with a flag to enable or disable it. Even if lsof or kill
> failed, it wouldn't be worse than now.
> 
> As far as I see there is no way to write a drop-in for a mount unit
> that allows to execute commands before the unmount happens, is that
> right? Sth. like "ExecPreUmount=" would help here, especially if there
> was sth. like a umount@.service that would be called for every umount
> with e.g. the mount point accessible with a variable.

I'm not allowed to talk about systemd frustration here ;-)


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] failing unmounts during reboot

2019-07-25 Thread Frank Steiner

Hi,

I'm currently discussing a problem with the SuSE support about failing
unmounts during reboot. Trying to debug this I realized that systemd
is not killing processes left over by some init.d script. E.g. use
the following script in /etc/init.d/

#!/bin/sh
#
### BEGIN INIT INFO
# Provides: bla
# Required-Start: $network $remote_fs sshd
# Required-Stop: $network $remote_fs sshd
# Default-Start:  2 3 5
# Description: test
### END INIT INFO
case "$1" in
 start) cd /test; /usr/bin/sleep 99d & ;;
  stop) true;;
esac


On shutdown, unmounting /test will fail because the sleep process is
not killed. Shouldn't there be a mechanism in systemd to kill processes
spawned by an LSB script when shutting these down?

And moreover, wouldn't it make sense to have a mechanism to at least
try to kill all processes using a filesystem before unmounting it?
We often see failing unmounts of several local or iscsi fs during
reboot, and in the support case we are currently working on with SuSE
failing iscsi filesystems even cause xfs I/O errors. So it might be a
good idea to have sth. like an lsof + kill before unmounting a
filesystem, maybe
configurable with a flag to enable or disable it. Even if lsof or kill
failed, it wouldn't be worse than now.

As far as I see there is no way to write a drop-in for a mount unit
that allows to execute commands before the unmount happens, is that
right? Sth. like "ExecPreUmount=" would help here, especially if there
was sth. like a umount@.service that would be called for every umount
with e.g. the mount point accessible with a variable.
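
The closest approximation I can think of with existing primitives would
be a small helper unit whose stop action runs before the unmount. This
is an untested sketch (unit name made up for illustration, and it
assumes fuser from psmisc is installed):

# /etc/systemd/system/test-preumount.service
[Unit]
Description=Kill processes using /test before it is unmounted
RequiresMountsFor=/test

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
# on shutdown this runs before /test is unmounted, because of the
# RequiresMountsFor= ordering above
ExecStop=/usr/bin/fuser -km /test

[Install]
WantedBy=multi-user.target

But that is a per-mount workaround, not the generic ExecPreUmount= hook
described above.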

cu,
Frank

--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. BioinformatikMail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17   Phone: +49 89 2180-4049
80333 Muenchen, Germany   Fax:   +49 89 2180-99-4049
* You can only understand recursion once you have understood recursion. *
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel