Hello, Oleg.

Sorry about the delay.

On Tue, Nov 10, 2015 at 09:20:17PM +0100, Oleg Nesterov wrote:
> > We simply need to reimplement cgroup freezer so that its userland
> > visible state is well defined (most likely jobctl stop).  Right now,
> > it's allowing userland to trigger "stuck somewhere in the kernel"
> > condition, so interactions with frozen tasks are naturally broken.
> 
> I agree, the freezer is not perfect, and it needs changes.
> 
> Still I think this needs a fix in ptrace code. At least we should not
> wait in TASK_UNINTERRUPTIBLE state.
> 
> And perhaps we can simply remove this logic? I forgot why do we hide this
> STOPPED -> RUNNING -> TRACED transition from the attaching thread. But the
> vague feeling tells me that we discussed this before and perhaps it was me
> who suggested to avoid the user-visible change when you introduced this
> transition...

Heh, it was too long ago for me to remember much. :)

> Anyway, now I do not understand why do we want to hide it. Lets consider
> the following "test-case",
> 
>       void test(int pid)
>       {
>               kill(pid, SIGSTOP);
>               waitpid(pid, NULL, WSTOPPED);
> 
>               ptrace(PTRACE_ATTACH-OR-PTRACE_SEIZE, pid, 0,0);
> 
>               assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);
>       }
> 
> Yes, it will fail if we remove JOBCTL_TRAPPING. But it can equally fail
> if SIGCONT comes before ATTACH, so perhaps we do not really care?
> 
> Jan, Pedro, do you think the patch below can break gdb somehow? With this
> patch you can never assume that waitpid(WNOHANG) or ptrace(WHATEVER) will
> succeed right after PTRACE_ATTACH/PTRACE_SEIZE, even if you know that the
> tracee was TASK_STOPPED before attach.
> 
> Tejun, do you see any reason to keep JOBCTL_TRAPPING?

Hmmm... It's nasty tho.  We'd be breaking a guaranteed userland behavior
to mask a deficiency (IMHO it's an outright bug) in a different
subsystem.  The problem here is that cgroup-frozen threads become
un-runnable on a running system.  It doesn't make sense to me to
work around that in every affected place rather than fixing it at
the source, especially if the workaround involves breaking a known,
supported userland behavior.  This is no different from frozen
processes failing to respond to SIGKILL.  I'd be a lot more
comfortable stating that the cgroup freezer is currently broken than
diddling with subtle ptrace semantics.
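
For reference, the quoted test case could be made self-contained roughly
as below.  This is only a sketch: it forks its own child instead of
taking a pid, the helper name and return codes are made up for
illustration, and it uses WUNTRACED, which has the same value as
WSTOPPED on Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): stop a freshly forked
 * child with SIGSTOP, attach to it, then detach right away.
 *
 * Returns  0 if PTRACE_DETACH succeeded immediately after attach,
 *          1 if the attach itself failed (e.g. ptrace not permitted),
 *          2 if the attach worked but the immediate detach failed,
 *         -1 if the child could not be created or stopped.
 */
static int attach_detach_stopped_child(int use_seize)
{
	int status, ret = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: sleep until a signal stops or kills us. */
		for (;;)
			pause();
	}

	/* Put the child into group stop and wait until that is reported. */
	if (kill(pid, SIGSTOP) < 0 ||
	    waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status)) {
		ret = -1;
		goto out;
	}

	/* Attach to the already-stopped task. */
	if (ptrace(use_seize ? PTRACE_SEIZE : PTRACE_ATTACH, pid, NULL, NULL) < 0) {
		ret = 1;
		goto out;
	}

	/* The step under discussion: detach without waiting in between. */
	if (ptrace(PTRACE_DETACH, pid, NULL, NULL) < 0)
		ret = 2;

out:
	/* Clean up the child no matter how far we got. */
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return ret;
}
```

With JOBCTL_TRAPPING in place I'd expect this to return 0 when ptrace
is permitted; a return value of 2 is the user-visible change being
discussed.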

Thanks.

-- 
tejun