Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
All of these have status R and D for their duration, and while all get
a SIGKILL from systemd on logout, none of the processes change status
or die until their kernel task is done. And each of these operations
completes successfully, none the worse for wear.

btrfs balance &
btrfs dev rem &
btrfs replace start

Only 'btrfs scrub' has status S, and once it gets SIGKILL, it goes Z
and all of its accounting is wrong. But the kernel tasks continue and
appear to complete.

I did all of this with a btrfs raid5, 3 and 4 disks, in a libvirt VM.
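
For anyone who wants to watch the same status letters without top or ps, here is a minimal sketch in Python that reads them from /proc. It assumes a Linux /proc filesystem, and the helper names are made up for illustration. The letter meanings are the ones documented in ps(1).

```python
# State letters as documented in ps(1); only the ones seen in this thread.
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (usually I/O)",
    "Z": "zombie (exited, not yet reaped)",
}


def state_from_stat_line(stat_line):
    """Return the one-letter state from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the last ')' and take the field that follows.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid):
    """Read the current state letter of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return state_from_stat_line(stat_file.read())
```

Calling process_state() on the btrfs PID in a loop should show the same flipping between R and D that top shows for balance, and S for a scrub that has not been killed yet.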

---
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
Aug 01 09:29:59 localhost.localdomain sudo[1875]:chris : TTY=pts/1
; PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs scrub start /mnt/x
Aug 01 09:30:16 localhost.localdomain systemd[1]: user@1000.service:
Killing process 1883 (btrfs) with signal SIGKILL.
Aug 01 09:43:34 localhost.localdomain sudo[2574]:chris : TTY=pts/1
; PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs scrub start /mnt/x
Aug 01 09:43:53 localhost.localdomain systemd[1]: user@1000.service:
Killing process 2579 (btrfs) with signal SIGKILL.
Aug 01 11:41:00 localhost.localdomain sudo[3479]:chris : TTY=pts/1
; PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs fi show
Aug 01 11:41:04 localhost.localdomain sudo[3492]:chris : TTY=pts/1
; PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs balance start
/mnt/x
Aug 01 11:41:14 localhost.localdomain kernel: BTRFS info (device vdb):
relocating block group 24800919552 flags 130
Aug 01 11:41:14 localhost.localdomain kernel: BTRFS info (device vdb):
relocating block group 23727177728 flags 132
Aug 01 11:41:14 localhost.localdomain kernel: BTRFS info (device vdb):
found 6 extents
Aug 01 11:41:14 localhost.localdomain kernel: BTRFS info (device vdb):
relocating block group 21512585216 flags 129
Aug 01 11:41:24 localhost.localdomain kernel: BTRFS info (device vdb):
found 27447 extents
Aug 01 11:41:26 localhost.localdomain kernel: BTRFS info (device vdb):
found 27446 extents
Aug 01 11:41:26 localhost.localdomain kernel: BTRFS info (device vdb):
relocating block group 19365101568 flags 129
Aug 01 11:41:36 localhost.localdomain systemd[1]: user@1000.service:
Killing process 3499 (btrfs) with signal SIGKILL.


It's using SIGKILL. The process goes Z for scrub but nothing happens
for balance. Weird. It's definitely not exempt though. Hmm, when I
don't filter the journal for btrfs...

Aug 01 11:54:38 localhost.localdomain systemd[3623]: Starting Exit the
Session...
Aug 01 11:54:38 localhost.localdomain systemd[3623]: Received
SIGRTMIN+24 from PID 4269 (kill).
Aug 01 11:54:38 localhost.localdomain systemd[1]: user@1000.service:
Killing process 4206 (sudo) with signal SIGKILL.
Aug 01 11:54:38 localhost.localdomain systemd[1]: user@1000.service:
Killing process 4213 (btrfs) with signal SIGKILL.
Aug 01 11:54:38 localhost.localdomain systemd[1]: Stopped User Manager
for UID 1000.
Aug 01 11:54:38 localhost.localdomain audit[1]: SERVICE_STOP pid=1
uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
msg='unit=user@1000 comm="systemd" exe="/usr/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Aug 01 11:54:38 localhost.localdomain systemd[1]: Removed slice User
Slice of chris.

In this case PID 4213 is the process that's still flipping between
status R and D. Kill is sent, but ignored. But not for scrubbing...


Aug 01 11:58:21 localhost.localdomain systemd[4294]: Starting Exit the
Session...
Aug 01 11:58:21 localhost.localdomain systemd[4294]: Received
SIGRTMIN+24 from PID 4933 (kill).
Aug 01 11:58:21 localhost.localdomain systemd[4301]:
pam_unix(systemd-user:session): session closed for user chris
Aug 01 11:58:21 localhost.localdomain systemd[1]: user@1000.service:
Killing process 4866 (btrfs) with signal SIGKILL.
Aug 01 11:58:21 localhost.localdomain systemd[1]: Stopped User Manager
for UID 1000.
Aug 01 11:58:21 localhost.localdomain audit[1]: SERVICE_STOP pid=1
uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
msg='unit=user@1000 comm="systemd" exe="/usr/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Aug 01 11:58:21 localhost.localdomain systemd[1]: Removed slice User
Slice of chris.
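
If it helps anyone compare runs, the kill events can be pulled out of journal text like the excerpts above with a few lines of Python. This is only a sketch: the function name is made up, and the pattern is derived from the message wording in these excerpts, so other systemd versions may phrase it differently.

```python
import re

# Matches e.g. "Killing process 4213 (btrfs) with signal SIGKILL."
KILL_RE = re.compile(r"Killing process (\d+) \(([^)]+)\) with signal (SIG\w+)")


def killed_processes(journal_text):
    """Return (pid, name, signal) for every kill message in the text."""
    # Mail clients wrap long journal lines, so collapse all whitespace
    # before matching to rejoin the wrapped halves.
    flat = " ".join(journal_text.split())
    return [(int(pid), name, sig) for pid, name, sig in KILL_RE.findall(flat)]
```

Fed the 11:54:38 excerpt, this lists both sudo (4206) and btrfs (4213); fed the 11:58:21 excerpt, only btrfs (4866).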


It must have something to do with running balance with &; scrub
doesn't need & to go to the background.




Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
On Mon, Aug 1, 2016 at 11:19 AM, Austin S. Hemmelgarn wrote:
> On 2016-08-01 13:15, Chris Murphy wrote:

>> I've been using balance with &, and when I logout, the btrfs command
>> continues to flip between status D and R, just like before logout and
>> it appears to complete. I still get status messages of the balance
>> after logout, in kernel messages.
>>
> Interesting, maybe balance is explicitly white-listed?  Either that, or it
> just ignores whatever signal systemd uses to kill stuff in this context (I
> initially thought SIGTERM, but SIGHUP would make more sense in this
> context), which wouldn't surprise me either.

I'm not aware of any program specific white listing method with
KillUserProcesses=yes. However, there is KillExcludeUsers which by
default is KillExcludeUsers=root. Everything I run with sudo appears in
top and ps as user root. So are these processes exempt? And if so, why
is btrfs scrub becoming a zombie process? I don't know if it's
appropriate, but I have asked (no response yet) whether all
things sudo should just be moved out of the user session. In my own
head I don't associate sudo commands with my user or my user session,
and at least top and ps agree with the former, so why not have sudo'd
processes put in a different scope from the outset?
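
To check what a given machine is actually configured with, something like the following Python sketch reads the three kill-related options from logind.conf text. The sample text and the function name are hypothetical. Options that are commented out or absent come back empty, because the compiled-in defaults are not visible from the file.

```python
import configparser

# Hypothetical file contents, for illustration only.
SAMPLE_LOGIND_CONF = """\
[Login]
KillUserProcesses=yes
#KillOnlyUsers=
KillExcludeUsers=root
"""


def read_kill_settings(text):
    """Return the kill-related [Login] options found in logind.conf text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)
    login = parser["Login"] if parser.has_section("Login") else {}
    return {
        # None means "not set in this file"; the built-in default applies.
        "KillUserProcesses": login.get("KillUserProcesses"),
        "KillOnlyUsers": (login.get("KillOnlyUsers") or "").split(),
        "KillExcludeUsers": (login.get("KillExcludeUsers") or "").split(),
    }
```

On a real system the text would come from /etc/systemd/logind.conf (plus any drop-ins, which this sketch does not merge).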


-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Austin S. Hemmelgarn

On 2016-08-01 13:15, Chris Murphy wrote:
> On Mon, Aug 1, 2016 at 10:58 AM, Austin S. Hemmelgarn wrote:
>> On 2016-08-01 12:19, Chris Murphy wrote:
>>> On Mon, Aug 1, 2016 at 10:08 AM, Austin S. Hemmelgarn wrote:
>>>> MD and DM RAID handle this by starting kernel threads to do the scrub.
>>>> They then store the info about the scrub in the array itself, so you
>>>> can query it externally.  If you watch, neither of those commands runs
>>>> longer than it takes to start the operation, so there's nothing for
>>>> systemd to kill.
>>>
>>> pvmove continues to run and report progress so it can be killed off,
>>> but it only polls for statistics, it's not actually recording them. So
>>> even though it gets killed, subsequent pvmove command shows correct
>>> statistics.
>>
>> Because all that the pvmove command is doing is polling for statistics. It
>> actually works kind of like a scrub, all the actual work is done in the
>> kernel, the userspace component just handles reporting.  The difference is
>> that the move operation is accounted and mutexed in the kernel itself,
>> instead of userspace like scrub does.  This model is actually essentially
>> what I think scrub (and balance for that matter) should look like, and if
>> implemented right, we could actually store scrub results in the FS itself
>> (that is, in the metadata, not in special files or anything like that).
>>>
>>> So that makes me wonder how btrfs device add and remove will behave,
>>> if issued in a DE which is then logged out of. Those commands do not
>>> return to prompt until they complete.
>>
>> They work via balance, so they should behave the same as a balance command,
>> which means it will likely run part way then get cancelled because of the
>> SIGTERM to the userspace component (assuming of course that it is still
>> running when you log out).
>
> I've been using balance with &, and when I logout, the btrfs command
> continues to flip between status D and R, just like before logout and
> it appears to complete. I still get status messages of the balance
> after logout, in kernel messages.

Interesting, maybe balance is explicitly white-listed?  Either that, or
it just ignores whatever signal systemd uses to kill stuff in this
context (I initially thought SIGTERM, but SIGHUP would make more sense
in this context), which wouldn't surprise me either.



Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
On Mon, Aug 1, 2016 at 10:58 AM, Austin S. Hemmelgarn wrote:
> On 2016-08-01 12:19, Chris Murphy wrote:
>>
>> On Mon, Aug 1, 2016 at 10:08 AM, Austin S. Hemmelgarn
>>  wrote:
>>
>>>
>>> MD and DM RAID handle this by starting kernel threads to do the scrub.
>>> They
>>> then store the info about the scrub in the array itself, so you can query
>>> it
>>> externally.  If you watch, neither of those commands runs longer than it
>>> takes to start the operation, so there's nothing for systemd to kill.
>>
>>
>> pvmove continues to run and report progress so it can be killed off,
>> but it only polls for statistics, it's not actually recording them. So
>> even though it gets killed, subsequent pvmove command shows correct
>> statistics.
>
> Because all that the pvmove command is doing is polling for statistics. It
> actually works kind of like a scrub, all the actual work is done in the
> kernel, the userspace component just handles reporting.  The difference is
> that the move operation is accounted and mutexed in the kernel itself,
> instead of userspace like scrub does.  This model is actually essentially
> what I think scrub (and balance for that matter) should look like, and if
> implemented right, we could actually store scrub results in the FS itself
> (that is, in the metadata, not in special files or anything like that).
>>
>>
>> So that makes me wonder how btrfs device add and remove will behave,
>> if issued in a DE which is then logged out of. Those commands do not
>> return to prompt until they complete.
>
> They work via balance, so they should behave the same as a balance command,
> which means it will likely run part way then get cancelled because of the
> SIGTERM to the userspace component (assuming of course that it is still
> running when you log out).

I've been using balance with &, and when I logout, the btrfs command
continues to flip between status D and R, just like before logout and
it appears to complete. I still get status messages of the balance
after logout, in kernel messages.

-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Austin S. Hemmelgarn

On 2016-08-01 12:19, Chris Murphy wrote:
> On Mon, Aug 1, 2016 at 10:08 AM, Austin S. Hemmelgarn wrote:
>> MD and DM RAID handle this by starting kernel threads to do the scrub. They
>> then store the info about the scrub in the array itself, so you can query it
>> externally.  If you watch, neither of those commands runs longer than it
>> takes to start the operation, so there's nothing for systemd to kill.
>
> pvmove continues to run and report progress so it can be killed off,
> but it only polls for statistics, it's not actually recording them. So
> even though it gets killed, subsequent pvmove command shows correct
> statistics.

Because all that the pvmove command is doing is polling for statistics.
It actually works kind of like a scrub, all the actual work is done in
the kernel, the userspace component just handles reporting.  The
difference is that the move operation is accounted and mutexed in the
kernel itself, instead of userspace like scrub does.  This model is
actually essentially what I think scrub (and balance for that matter)
should look like, and if implemented right, we could actually store
scrub results in the FS itself (that is, in the metadata, not in special
files or anything like that).

> So that makes me wonder how btrfs device add and remove will behave,
> if issued in a DE which is then logged out of. Those commands do not
> return to prompt until they complete.

They work via balance, so they should behave the same as a balance
command, which means it will likely run part way then get cancelled
because of the SIGTERM to the userspace component (assuming of course
that it is still running when you log out).

>> Replace was implemented the way scrub should have been.  It's done entirely
>> in the kernel, and the userspace tools just start, stop and check status.
>> We should just get rid of the whole scrub state file crap and have a way to
>> query the last scrub status directly from the FS. That would fix this
>> particular issue, and make scrub more consistent with everything else (and
>> solve the stale scrub status bug too).
>
> OK, I'll update the bug report.




Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
On Mon, Aug 1, 2016 at 10:19 AM, Chris Murphy  wrote:

> So that makes me wonder how btrfs device add and remove will behave,
> if issued in a DE which is then logged out of. Those commands do not
> return to prompt until they complete.

Strike add. That's fast. I'm concerned about dev delete/remove and also resize.


-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
On Mon, Aug 1, 2016 at 10:08 AM, Austin S. Hemmelgarn wrote:

>
> MD and DM RAID handle this by starting kernel threads to do the scrub. They
> then store the info about the scrub in the array itself, so you can query it
> externally.  If you watch, neither of those commands runs longer than it
> takes to start the operation, so there's nothing for systemd to kill.

pvmove continues to run and report progress so it can be killed off,
but it only polls for statistics, it's not actually recording them. So
even though it gets killed, subsequent pvmove command shows correct
statistics.

So that makes me wonder how btrfs device add and remove will behave,
if issued in a DE which is then logged out of. Those commands do not
return to prompt until they complete.


> Replace was implemented the way scrub should have been.  It's done entirely
> in the kernel, and the userspace tools just start, stop and check status.
> We should just get rid of the whole scrub state file crap and have a way to
> query the last scrub status directly from the FS. That would fix this
> particular issue, and make scrub more consistent with everything else (and
> solve the stale scrub status bug too).

OK, I'll update the bug report.

-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Austin S. Hemmelgarn

On 2016-08-01 11:46, Chris Murphy wrote:
> OK I've created a new volume that's sufficiently large I can tell if
> the kernel workers doing the scrub are also being killed off. First, I
> do a scrub without logging out to get a time for an uninterrupted
> scrub. And then initiate a scrub which I start timing, but then logout
> of the DE and watch for the kernel workers to stop.
>
> - The kernel workers are killed off within ~5 seconds of an
> uninterrupted scrub. Conclusion is the scrub is still happening by the
> kernel.

This makes sense, systemd is killing based on session ID, and the kernel
workers have an sid of 0 (I think, it should be whatever the sid of
kthreadd (PID 2) has).

> - The btrfs process for the scrub isn't killed either, it's just
> status Z for the entire length of the scrub.

Z means the process is dead, but nothing has called wait() or similar to
get status info from it, so it was killed, it's just that nothing took
the body to the morgue yet.
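
The "dead but not yet collected" state is easy to reproduce outside btrfs. The sketch below (Python, Linux only, function names made up for illustration) forks a child that exits immediately, watches it sit in state Z, and then calls waitpid() to collect it. It says nothing about why the scrub process's parent had not called wait() in the logout case; it only shows what Z means.

```python
import os
import time


def read_state(pid):
    """Return the state letter from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        # The field after the parenthesised command name is the state.
        return stat_file.read().rsplit(")", 1)[1].split()[0]


def observe_unreaped_child(timeout=5.0):
    """Fork a child that exits at once; report its state before wait()."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, skipping parent cleanup

    # Parent: poll until the exited child shows up as a zombie.
    deadline = time.monotonic() + timeout
    state = read_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = read_state(pid)

    os.waitpid(pid, 0)  # collect the exit status; the zombie entry goes away
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, still_listed
```

The first returned value is the state seen while nothing had waited on the child, the second is whether /proc still lists the PID after waitpid().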

> - While this scrubbing is happening, issuing a 'btrfs scrub status'
> gets me consistently stale information. It's the same information from
> the moment the DE was logged out.

This makes sense, because the userspace component updates this info (and
that's _all_ it does).

> [root@localhost ~]# btrfs scrub status /mnt/x
> scrub status for 9f9e5e1f-8d5a-44a0-8f69-8a393fb7ff3c
> scrub started at Mon Aug  1 09:29:59 2016, running for 00:00:15
> total bytes scrubbed: 3.06GiB with 0 errors
>
> Even a minute later this information is the same.
>
> Once the zombie btrfs process dies off, and the kernel workers stop
> working, I get this bogus status information:
>
> [root@localhost ~]# btrfs scrub status /mnt/x
> scrub status for 9f9e5e1f-8d5a-44a0-8f69-8a393fb7ff3c
> scrub started at Mon Aug  1 09:29:59 2016, interrupted after
> 00:00:15, not running
> total bytes scrubbed: 3.06GiB with 0 errors
>
>
> Only the user process was interrupted. Not the scrub. Looks like only
> the user process is writing out the statistics and status, so once it
> goes zombie, there's no accounting, rather than accounting being done
> independently via sysfs.
>
> Can I resume this scrub? Yes. But that's also bogus because there
> really isn't anything to resume. All that work was done already, it
> just hasn't been accounted for.
>
> So whether you want to call this a bug, or deeply suboptimal behavior,
> I think that's splitting hairs. Neither mdadm nor LVM scrubs are
> affected by this logout behavior and systemd killing off user
> processes. I always get reliable scrub status information from either
> 'echo check md/sync_action' or 'lvchange --syncaction check' before
> and after logging out of the DE from which the command was issued.

MD and DM RAID handle this by starting kernel threads to do the scrub.
They then store the info about the scrub in the array itself, so you can
query it externally.  If you watch, neither of those commands runs
longer than it takes to start the operation, so there's nothing for
systemd to kill.

> And it's even inconsistent with btrfs replace where it continues to
> give me correct status information from a tty shell even though the
> replace command was issued in a DE, subsequently logged out of. So
> 'btrfs scrub' is inconsistent no matter how you look at it. It's a
> bug.

Replace was implemented the way scrub should have been.  It's done
entirely in the kernel, and the userspace tools just start, stop and
check status.  We should just get rid of the whole scrub state file crap
and have a way to query the last scrub status directly from the FS.
That would fix this particular issue, and make scrub more consistent
with everything else (and solve the stale scrub status bug too).



Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
On Mon, Aug 1, 2016 at 9:46 AM, Chris Murphy  wrote:

> - The kernel workers are killed off within ~5 seconds of an
> uninterrupted scrub.

i.e. the kernel workers are doing the same work. They aren't being
killed sooner as a result of logging out from the DE. The only
apparent change from logging out from the DE from which the scrub was
issued, is the btrfs process becomes status Z. It is in fact not being
killed, which itself is kinda interesting/unexpected.


-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Chris Murphy
OK I've created a new volume that's sufficiently large I can tell if
the kernel workers doing the scrub are also being killed off. First, I
do a scrub without logging out to get a time for an uninterrupted
scrub. And then initiate a scrub which I start timing, but then logout
of the DE and watch for the kernel workers to stop.

- The kernel workers are killed off within ~5 seconds of an
uninterrupted scrub. Conclusion is the scrub is still happening by the
kernel.
- The btrfs process for the scrub isn't killed either, it's just
status Z for the entire length of the scrub.
- While this scrubbing is happening, issuing a 'btrfs scrub status'
gets me consistently stale information. It's the same information from
the moment the DE was logged out.

[root@localhost ~]# btrfs scrub status /mnt/x
scrub status for 9f9e5e1f-8d5a-44a0-8f69-8a393fb7ff3c
scrub started at Mon Aug  1 09:29:59 2016, running for 00:00:15
total bytes scrubbed: 3.06GiB with 0 errors

Even a minute later this information is the same.

Once the zombie btrfs process dies off, and the kernel workers stop
working, I get this bogus status information:

[root@localhost ~]# btrfs scrub status /mnt/x
scrub status for 9f9e5e1f-8d5a-44a0-8f69-8a393fb7ff3c
scrub started at Mon Aug  1 09:29:59 2016, interrupted after
00:00:15, not running
total bytes scrubbed: 3.06GiB with 0 errors


Only the user process was interrupted. Not the scrub. Looks like only
the user process is writing out the statistics and status, so once it
goes zombie, there's no accounting, rather than accounting being done
independently via sysfs.
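
To make the staleness check repeatable, here is a rough Python sketch that parses the two status texts shown above and flags a reading as stale when a later one still says "running" with identical numbers. The patterns are taken only from the output pasted in this message, so other btrfs-progs versions may print something different; the function names are made up.

```python
import re

# Patterns derived from the two outputs quoted in this message only.
STATUS_RE = re.compile(
    r"scrub started at (?P<started>.+?), "
    r"(?P<verb>running for|interrupted after) (?P<elapsed>\d{2}:\d{2}:\d{2})"
)
TOTALS_RE = re.compile(
    r"total bytes scrubbed: (?P<scrubbed>\S+) with (?P<errors>\d+) errors"
)


def parse_scrub_status(text):
    """Extract state, elapsed time, bytes and error count from status text."""
    flat = " ".join(text.split())  # undo line wrapping
    status = STATUS_RE.search(flat)
    totals = TOTALS_RE.search(flat)
    if not status or not totals:
        raise ValueError("unrecognised scrub status text")
    return {
        "state": "running" if status["verb"] == "running for" else "interrupted",
        "elapsed": status["elapsed"],
        "scrubbed": totals["scrubbed"],
        "errors": int(totals["errors"]),
    }


def looks_stale(earlier, later):
    """True when two readings taken apart both say 'running', unchanged."""
    return earlier["state"] == "running" and earlier == later
```

Two readings a minute apart that both parse to "running for 00:00:15" and 3.06GiB would be flagged, which matches what I saw after logout.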

Can I resume this scrub? Yes. But that's also bogus because there
really isn't anything to resume. All that work was done already, it
just hasn't been accounted for.

So whether you want to call this a bug, or deeply suboptimal behavior,
I think that's splitting hairs. Neither mdadm nor LVM scrubs are
affected by this logout behavior and systemd killing off user
processes. I always get reliable scrub status information from either
'echo check md/sync_action' or 'lvchange --syncaction check' before
and after logging out of the DE from which the command was issued.
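
For comparison, this is roughly what "querying the kernel directly" looks like for md. The Python sketch below reads the sync_action and sync_completed files that the md driver exposes under /sys/block/<array>/md/. The array name md0 is an assumption, and the sysfs_root parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def md_sync_status(array="md0", sysfs_root="/sys"):
    """Read the kernel-maintained check/resync state of an md array."""
    md_dir = Path(sysfs_root) / "block" / array / "md"
    return {
        # Current operation, e.g. "idle" or "check".
        "sync_action": (md_dir / "sync_action").read_text().strip(),
        # Progress of the current operation, or "none" when idle.
        "sync_completed": (md_dir / "sync_completed").read_text().strip(),
    }
```

Nothing here depends on the process that started the check still being alive, which is the point of the comparison.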

And it's even inconsistent with btrfs replace where it continues to
give me correct status information from a tty shell even though the
replace command was issued in a DE, subsequently logged out of. So
'btrfs scrub' is inconsistent no matter how you look at it. It's a
bug.


-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-08-01 Thread Austin S. Hemmelgarn

On 2016-07-30 20:29, Chris Murphy wrote:
> On Sat, Jul 30, 2016 at 2:02 PM, Chris Murphy wrote:
>> Short version: When systemd-logind logind.conf KillUserProcesses=yes,
>> and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and
>
> Same thing with Xfce, so it's not DE specific. (Unsurprising.)
>
> I inflated the size of the test volume, and it seems pretty clear that
> the scrub is not completing, as the kernel threads stop sooner when
> logging out vs not logging out. So the status reporting an
> interruption appears to be valid for the net operation, not merely the
> user space tool being interrupted.

You have your terminals set to start the shell as a login shell I'm
guessing.  That's probably why closing the terminal window is triggering
systemd's process killing.  It will of course still trigger when you
close the graphical session though.  Personally, this is yet another
reason for me to not like systemd.  This setting breaks traditional UNIX
userspace semantics.

Personally, I'm with Duncan on this one though, if resume works
correctly, then it's not a bug, just a bad interaction between an
administrative tool designed for a server and an init system designed
for a desktop.




Re: systemd KillUserProcesses=yes and btrfs scrub

2016-07-31 Thread Duncan
Chris Murphy posted on Sat, 30 Jul 2016 14:02:17 -0600 as excerpted:

> Short version: When systemd-logind logind.conf KillUserProcesses=yes, and
> the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and then
> logs out of the shell, the user space operation is killed, and btrfs
> scrub status reports that the scrub was aborted. [1]

What does btrfs scrub resume do?  Resume, or error?

If it resumes, I'd say RESOLVED/NOTABUG as both that systemd option and 
btrfs scrub appear to be working as intended.  If it doesn't, then 
there's definitely a btrfs bug, even if you argue it's only in the 
documentation, because the manpage (tho still 4.6.1, here) says it 
resumes an interrupted scrub but won't start a new one if the scrub 
finished successfully, and an abort is definitely an interruption, not a 
successful finish.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: systemd KillUserProcesses=yes and btrfs scrub

2016-07-31 Thread Chris Murphy
On Sun, Jul 31, 2016 at 4:56 AM, Gabriel C  wrote:
>
>
> On 30.07.2016 22:02, Chris Murphy wrote:
>> Short version: When systemd-logind logind.conf KillUserProcesses=yes,
>> and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and
>> then logs out of the shell, the user space operation is killed, and
>> btrfs scrub status reports that the scrub was aborted. [1]
>>
>
> How is this a bug?

If the privilege-escalated operation (kernel threads included) is
clobbered, then it's a bug, because there's every reason for a user to
issue a command that could take hours or days and not have to stay
logged in to their GUI shell session while it runs, for example over
the weekend. Yes, of course they could schedule it, but saying they
could do it another way doesn't fix the use case of doing it manually.

If the operation continues, and just the user space command is killed
off, it's a bug because the statistics and status of the scrub are
lost to future status checks; that is, "interrupted" is sufficiently
misleading that it's false. The operation did continue, we've just
lost the conclusion.

With balance and replace, the user process is killed while the kernel
work continues, and it's still possible for a user to get current (and
correct) status information for both.

Further it's arguably a regression compared to equivalent mdadm and
LVM RAID behaviors.


>
> It is exactly what 'KillUserProcesses=yes' is expected to do.
>

No, it basically breaks scrub initiated within a user's GUI session
and that is in no way intended by anyone, it's a side effect. The
question is how to fix it, not debate whether it's a bug, that's
ridiculous.
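
One possible workaround, offered as an untested sketch rather than a fix: start the scrub as a transient system service with systemd-run, so that it does not live in the user's session at all. The Python below only builds the command line. The unit name is made up, -B keeps btrfs scrub in the foreground so the service tracks it, and the command would have to be run as root (for example via sudo).

```python
import shlex


def scrub_outside_session_argv(mountpoint, unit="btrfs-scrub-manual"):
    """Build a systemd-run command that starts a scrub as a system unit."""
    return [
        "systemd-run",
        f"--unit={unit}",  # hypothetical unit name, any unused name works
        "btrfs", "scrub", "start", "-B", mountpoint,
    ]


def scrub_outside_session_command(mountpoint, unit="btrfs-scrub-manual"):
    """Same command as a shell-quoted string, for copy and paste."""
    return shlex.join(scrub_outside_session_argv(mountpoint, unit))
```

This sidesteps the logout problem instead of solving it; the accounting still lives in the userspace process.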

-- 
Chris Murphy


Re: systemd KillUserProcesses=yes and btrfs scrub

2016-07-31 Thread Gabriel C


On 30.07.2016 22:02, Chris Murphy wrote:
> Short version: When systemd-logind logind.conf KillUserProcesses=yes,
> and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and
> then logs out of the shell, the user space operation is killed, and
> btrfs scrub status reports that the scrub was aborted. [1]
>

How is this a bug?

It is exactly what 'KillUserProcesses=yes' is expected to do.



Re: systemd KillUserProcesses=yes and btrfs scrub

2016-07-30 Thread Chris Murphy
On Sat, Jul 30, 2016 at 2:02 PM, Chris Murphy  wrote:
> Short version: When systemd-logind logind.conf KillUserProcesses=yes,
> and the user does "sudo btrfs scrub start" in e.g. GNOME Terminal, and

Same thing with Xfce, so it's not DE specific. (Unsurprising.)

I inflated the size of the test volume, and it seems pretty clear that
the scrub is not completing, as the kernel threads stop sooner when
logging out vs not logging out. So the status reporting an
interruption appears to be valid for the net operation, not merely the
user space tool being interrupted.



-- 
Chris Murphy