[PATCH 0/4] utrace for 3.0 kernel

2011-06-20 Thread Oleg Nesterov
Hello.

Utrace patches for 3.0 kernel

0001-ptrace-temporary-revert-the-recent-ptrace-jobctl-re.patch
0002-tracehooks-preparation-for-ptrace-utrace.patch
0003-utrace-core.patch
0004-implement-utrace-ptrace.patch

Also available in the following git branch:

git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc.git utrace-3.0

Note: 1/4 is a temporary hack to keep utrace working; we need to
completely rework the ptrace/utrace interaction.

Oleg.



[PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Oleg Nesterov
Temporary revert the following patches to keep utrace/utrace-ptrace working:

40ae717d1e78d982bd469b2013a4cbf4ec1ca434
ptrace: fix signal->wait_chldexit usage in task_clear_group_stop_trapping()

321fb561971ba0f10ce18c0f8a4b9fbfc7cef4b9
ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/

ee77f075921730b2b465880f9fd4367003bdab39
signal: Turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED

780006eac2fe7f4d2582da16a096e5a44c4767ff
signal: do_signal_stop: Remove the unneeded task_clear_group_stop_pending()

244056f9dbbc6dc4126a301c745fa3dd67d8af3c
job control: Don't send duplicate job control stop notification while ptraced

ceb6bd67f9b9db765e1c29405f26e8460391badd
job control: Notify the real parent of job control events regardless of ptrace

62bcf9d992ecc19ea4f37ff57ee0be3e843e
job control: Job control stop notifications should always go to the real parent

75b95953a56969a990e6ce154b260be83818fe71
job control: Add @for_ptrace to do_notify_parent_cldstop()

45cb24a1da53beb70f09efccc0373f6a47a9efe0
job control: Allow access to job control events through ptracees

9b84cca2564b9a5b2d064fb44d2a55a5b44473a0
job control: Fix ptracer wait(2) hang and explain notask_error clearing

408a37de6c95832a4880a88a742f89f0cc554d06
job control: Don't set group_stop exit_code if re-entering job control stop

0e9f0a4abfd80f8adca624538d479d95159b16d8
ptrace: Always put ptracee into appropriate execution state

e3bd058f62896ec7a2c605ed62a0a811e9147947
ptrace: Collapse ptrace_untrace() into __ptrace_unlink()

d79fdd6d96f46fabb779d86332e3677c6f5c2a4f
ptrace: Clean transitions between TASK_STOPPED and TRACED

5224fa3660ad3881d2f2ad726d22614117963f10
ptrace: Make do_signal_stop() use ptrace_stop() if the task is being ptraced

0ae8ce1c8c5b9007ce6bfc83ec2aa0dfce5bbed3
ptrace: Participate in group stop from ptrace_stop() iff the task is trapping for group stop

39efa3ef3a376a4e53de2f82fc91182459d34200
signal: Use GROUP_STOP_PENDING to stop once for a single group stop

e5c1902e9260a0075ea52cb5ef627a8d9aaede89
signal: Fix premature completion of group stop when interfered by ptrace

fe1bc6a0954611b806f9e158eb0817cf8ba21660
ptrace: Add @why to ptrace_stop()

edf2ed153bcae52de70db00a98b0e81a5668e563
ptrace: Kill tracehook_notify_jctl()

This obviously reverts some user-visible fixes, but the fixed problems
are very old and minor, they were never reported. In the long term we
need another solution.

Signed-off-by: Oleg Nesterov <o...@redhat.com>
---
 fs/exec.c                 |    1 -
 include/linux/sched.h     |   17 +--
 include/linux/tracehook.h |   27 
 kernel/exit.c             |   77 ++-
 kernel/ptrace.c           |  116 +---
 kernel/signal.c           |  339 +
 6 files changed, 148 insertions(+), 429 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 6075a1e..82b5379 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1769,7 +1769,6 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_group_stop_pending(t);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a837b20..6c42e24 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -663,8 +663,9 @@ struct signal_struct {
  * Bits in flags field of signal_struct.
  */
 #define SIGNAL_STOP_STOPPED	0x0001 /* job control stop in effect */
-#define SIGNAL_STOP_CONTINUED  0x0002 /* SIGCONT since WCONTINUED reap */
-#define SIGNAL_GROUP_EXIT  0x0004 /* group exit in progress */
+#define SIGNAL_STOP_DEQUEUED   0x0002 /* stop signal dequeued */
+#define SIGNAL_STOP_CONTINUED  0x0004 /* SIGCONT since WCONTINUED reap */
+#define SIGNAL_GROUP_EXIT  0x0008 /* group exit in progress */
 /*
  * Pending notifications to parent.
  */
@@ -1283,7 +1284,6 @@ struct task_struct {
int exit_state;
int exit_code, exit_signal;
int pdeath_signal;  /*  The signal sent when the parent dies  */
-	unsigned int group_stop;	/* GROUP_STOP_*, siglock protected */
/* ??? */
unsigned int personality;
unsigned did_exec:1;
@@ -1803,17 +1803,6 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
 #define used_math() tsk_used_math(current)
 
-/*
- * task->group_stop flags
- */
-#define GROUP_STOP_SIGMASK 0x/* signr of the last group stop */
-#define GROUP_STOP_PENDING (1 << 16) /* task should 

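As an illustration of the sched.h hunk above: with this revert applied,
SIGNAL_STOP_DEQUEUED is once again a bit in the flags word of signal_struct,
and the flags after it move up by one bit. The following is only a userspace
sketch of that layout, not kernel code. struct demo_signal and the demo_*
helpers are hypothetical names invented for this note; only the four flag
values and their comments are taken from the hunk.

```c
#include <stdbool.h>

/* Flag values as they appear on the "+" side of the hunk above. */
#define SIGNAL_STOP_STOPPED	0x0001 /* job control stop in effect */
#define SIGNAL_STOP_DEQUEUED	0x0002 /* stop signal dequeued */
#define SIGNAL_STOP_CONTINUED	0x0004 /* SIGCONT since WCONTINUED reap */
#define SIGNAL_GROUP_EXIT	0x0008 /* group exit in progress */

/* Hypothetical stand-in for the flags field of signal_struct. */
struct demo_signal {
	unsigned int flags;
};

/* Record that a stop signal was dequeued, leaving other bits alone. */
static inline void demo_mark_stop_dequeued(struct demo_signal *sig)
{
	sig->flags |= SIGNAL_STOP_DEQUEUED;
}

/* Forget a dequeued stop signal, leaving other bits alone. */
static inline void demo_clear_stop_dequeued(struct demo_signal *sig)
{
	sig->flags &= ~SIGNAL_STOP_DEQUEUED;
}

/* True if the "stop signal dequeued" bit is currently set. */
static inline bool demo_stop_dequeued(const struct demo_signal *sig)
{
	return (sig->flags & SIGNAL_STOP_DEQUEUED) != 0;
}
```

Because each flag occupies its own bit, setting or clearing
SIGNAL_STOP_DEQUEUED does not disturb SIGNAL_GROUP_EXIT or the other flags.
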
[PATCH 4/4] implement utrace-ptrace

2011-06-20 Thread Oleg Nesterov
The patch adds the new file, kernel/ptrace-utrace.c, which contains
the new implementation of ptrace over utrace.

It's supposed to be an invisible implementation change, nothing should
change to userland when CONFIG_UTRACE is enabled.

Signed-off-by: Roland McGrath <rol...@redhat.com>
Signed-off-by: Oleg Nesterov <o...@redhat.com>
---
 include/linux/ptrace.h |    3 +
 kernel/Makefile        |    1 +
 kernel/ptrace-utrace.c | 1173 
 kernel/ptrace.c        |  638 +-
 kernel/utrace.c        |   16 +
 5 files changed, 1513 insertions(+), 318 deletions(-)
 create mode 100644 kernel/ptrace-utrace.c

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 446ed8f..5b0562e 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -101,6 +101,9 @@
 
 extern bool __ptrace_detach(struct task_struct *tracer,
struct task_struct *tracee);
+extern int ptrace_traceme(void);
+extern int ptrace_attach(struct task_struct *tsk);
+extern void ptrace_notify_stop(struct task_struct *tracee);
 
 extern long arch_ptrace(struct task_struct *child, long request,
unsigned long addr, unsigned long data);
diff --git a/kernel/Makefile b/kernel/Makefile
index 4a22e81..5c280dc 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -69,6 +69,7 @@ obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o
 obj-$(CONFIG_SMP) += stop_machine.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_UTRACE) += utrace.o
+obj-$(CONFIG_UTRACE) += ptrace-utrace.o
 obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
 obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
 obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o
diff --git a/kernel/ptrace-utrace.c b/kernel/ptrace-utrace.c
new file mode 100644
index 000..7a9b396
--- /dev/null
+++ b/kernel/ptrace-utrace.c
@@ -0,0 +1,1173 @@
+/*
+ * linux/kernel/ptrace.c
+ *
+ * (C) Copyright 1999 Linus Torvalds
+ *
+ * Common interfaces for ptrace() which we do not want
+ * to continually duplicate across every architecture.
+ */
+
+#include <linux/capability.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/errno.h>
+#include <linux/mm.h>
+#include <linux/highmem.h>
+#include <linux/pagemap.h>
+#include <linux/ptrace.h>
+#include <linux/utrace.h>
+#include <linux/security.h>
+#include <linux/signal.h>
+#include <linux/audit.h>
+#include <linux/pid_namespace.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+
+/*
+ * unptrace a task: move it back to its original parent and
+ * remove it from the ptrace list.
+ *
+ * Must be called with the tasklist lock write-held.
+ */
+void __ptrace_unlink(struct task_struct *child)
+{
+	BUG_ON(!child->ptrace);
+
+	child->ptrace = 0;
+	child->parent = child->real_parent;
+	list_del_init(&child->ptrace_entry);
+}
+
+struct ptrace_context {
+   int options;
+
+   int signr;
+   siginfo_t   *siginfo;
+
+   int stop_code;
+   unsigned long   eventmsg;
+
+   enum utrace_resume_action   resume;
+};
+
+#define PT_UTRACED 0x1000
+
+#define PTRACE_O_SYSEMU	0x100
+#define PTRACE_O_DETACHED	0x200
+
+#define PTRACE_EVENT_SYSCALL	(1 << 16)
+#define PTRACE_EVENT_SIGTRAP	(2 << 16)
+#define PTRACE_EVENT_SIGNAL	(3 << 16)
+/* events visible to user-space */
+#define PTRACE_EVENT_MASK  0x
+
+static inline bool ptrace_event_pending(struct ptrace_context *ctx)
+{
+	return ctx->stop_code != 0;
+}
+
+static inline int get_stop_event(struct ptrace_context *ctx)
+{
+	return ctx->stop_code >> 8;
+}
+
+static inline void set_stop_code(struct ptrace_context *ctx, int event)
+{
+	ctx->stop_code = (event << 8) | SIGTRAP;
+}
+
+static inline struct ptrace_context *
+ptrace_context(struct utrace_engine *engine)
+{
+	return engine->data;
+}
+
+static const struct utrace_engine_ops ptrace_utrace_ops; /* forward decl */
+
+static struct utrace_engine *ptrace_lookup_engine(struct task_struct *tracee)
+{
+	return utrace_attach_task(tracee, UTRACE_ATTACH_MATCH_OPS,
+					&ptrace_utrace_ops, NULL);
+}
+
+static int utrace_barrier_uninterruptible(struct task_struct *target,
+   struct utrace_engine *engine)
+{
+   for (;;) {
+   int err = utrace_barrier(target, engine);
+
+   if (err != -ERESTARTSYS)
+   return err;
+
+   schedule_timeout_uninterruptible(1);
+   }
+}
+
+static struct utrace_engine *
+ptrace_reuse_engine(struct task_struct *tracee)
+{
+   struct utrace_engine *engine;
+   struct ptrace_context *ctx;
+   int err = -EPERM;
+
+   engine = ptrace_lookup_engine(tracee);
+   if (IS_ERR(engine))
+   return engine;
+
+   ctx = 

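The stop_code helpers in the hunk above pack an event number and SIGTRAP into
a single int. The following is a minimal userspace sketch of that encoding,
assuming the operators stripped from the archived text are the shifts `<<` and
`>>`. struct demo_ctx and the demo_* functions are hypothetical stand-ins for
struct ptrace_context and its helpers; they are not part of the patch.

```c
#include <signal.h>	/* SIGTRAP */

/* Hypothetical stand-in for the stop_code field of struct ptrace_context. */
struct demo_ctx {
	int stop_code;
};

/* A non-zero stop_code means an event is waiting to be reported. */
static inline int demo_event_pending(const struct demo_ctx *ctx)
{
	return ctx->stop_code != 0;
}

/* Pack the event above the low byte; the low byte carries SIGTRAP. */
static inline void demo_set_stop_code(struct demo_ctx *ctx, int event)
{
	ctx->stop_code = (event << 8) | SIGTRAP;
}

/* Recover the event by discarding the low (signal) byte. */
static inline int demo_get_stop_event(const struct demo_ctx *ctx)
{
	return ctx->stop_code >> 8;
}
```

With this layout the low byte always holds SIGTRAP, and an internal event such
as (3 << 16) still fits in a 32-bit int after the additional shift by 8.
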
Re: [PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Oleg Nesterov
On 06/20, Kyle McMartin wrote:

> On Mon, Jun 20, 2011 at 06:07:44PM +0200, Oleg Nesterov wrote:
> > Temporary revert the following patches to keep utrace/utrace-ptrace working:
> >
> > huge list of patches here
> >
> > This obviously reverts some user-visible fixes, but the fixed problems
> > are very old and minor, they were never reported. In the long term we
> > need another solution.
> >
>
> Dude, that's just not acceptable, that's way too much offset to deal with
> against upstream,

Yes, this reverts 20 patches. But they only touch the ptrace/stop paths,
there won't be more changes in this area until 3.1.

> especially since it's looking like uprobes will get
> merged in 3.1...

Probably yes.

In any case, this series should be dropped when fedora switches to 3.1.
I'll try to do something more clever for 3.1 if utrace is still needed.
Until then we need something for systemtap...

Oleg.



Re: [PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Dave Jones
On Mon, Jun 20, 2011 at 12:16:34PM -0400, Kyle McMartin wrote:
 > On Mon, Jun 20, 2011 at 06:07:44PM +0200, Oleg Nesterov wrote:
 > > Temporary revert the following patches to keep utrace/utrace-ptrace 
 > > working:
 > > 
 > > huge list of patches here
 > > 
 > > This obviously reverts some user-visible fixes, but the fixed problems
 > > are very old and minor, they were never reported. In the long term we
 > > need another solution.
 > 
 > Dude, that's just not acceptable, that's way too much offset to deal with
 > against upstream, especially since it's looking like uprobes will get
 > merged in 3.1... (at least, a lot of the comments seem to have been
 > well-addressed on linux-mm.)

I have still yet to see a justification why we want to continue carrying utrace
in Fedora at all. And "We want it in RHEL" isn't a good enough answer.

It's been FIVE years that we carried that thing without it getting upstream.

What benefit is there in continuing to carry this thing at all ?
Utrace has been an absolute disaster from a merging standpoint.
Even Xen didn't take this long to get upstream.

Dave



Re: [PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Josh Stone
On 06/20/2011 09:44 AM, Dave Jones wrote:
> What benefit is there in continuing to carry this thing at all ?
> Utrace has been an absolute disaster from a merging standpoint.
> Even Xen didn't take this long to get upstream.

I can't dispute the upstream disappointment, but the obvious benefit is
enabling uprobes for systemtap.  There are a growing number of packages
building in markers with systemtap-sdt-devel for debugging and tracing,
so they will expect a way to hook into these.  Yes, the impending
inode-uprobes will be sufficient for this case, but it's a step
backwards in other respects as well.

Josh



Re: [PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Dave Jones
On Mon, Jun 20, 2011 at 10:18:26AM -0700, Josh Stone wrote:
 > On 06/20/2011 09:44 AM, Dave Jones wrote:
 > > What benefit is there in continuing to carry this thing at all ?
 > > Utrace has been an absolute disaster from a merging standpoint.
 > > Even Xen didn't take this long to get upstream.
 > 
 > I can't dispute the upstream disappointment, but the obvious benefit is
 > enabling uprobes for systemtap.  There are a growing number of packages
 > building in markers with systemtap-sdt-devel for debugging and tracing,
 > so they will expect a way to hook into these.  Yes, the impending
 > inode-uprobes will be sufficient for this case, but it's a step
 > backwards in other respects as well.

I'm sure both the Fedora systemtap users will be bummed if it stops working,
but the truth is outside of RHEL, and the people who actually work on
systemtap, afaics, no-one gives a damn.

Dave



Re: [PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Josh Stone
On 06/20/2011 10:28 AM, Dave Jones wrote:
> On Mon, Jun 20, 2011 at 10:18:26AM -0700, Josh Stone wrote:
>  > On 06/20/2011 09:44 AM, Dave Jones wrote:
>  > > What benefit is there in continuing to carry this thing at all ?
>  > > Utrace has been an absolute disaster from a merging standpoint.
>  > > Even Xen didn't take this long to get upstream.
>  > 
>  > I can't dispute the upstream disappointment, but the obvious benefit is
>  > enabling uprobes for systemtap.  There are a growing number of packages
>  > building in markers with systemtap-sdt-devel for debugging and tracing,
>  > so they will expect a way to hook into these.  Yes, the impending
>  > inode-uprobes will be sufficient for this case, but it's a step
>  > backwards in other respects as well.
> 
> I'm sure both the Fedora systemtap users will be bummed if it stops working,
> but the truth is outside of RHEL, and the people who actually work on
> systemtap, afaics, no-one gives a damn.

Packagers are adding these markers of their own accord, and in most
cases are getting them upstream as well.  It is only kernel developers
who are so hostile/apathetic/etc.

Josh



Re: [PATCH 1/4] ptrace: temporary revert the recent ptrace/jobctl rework

2011-06-20 Thread Matthew Garrett
On Mon, Jun 20, 2011 at 10:43:55AM -0700, Josh Stone wrote:

> Packagers are adding these markers of their own accord, and in most
> cases are getting them upstream as well.  It is only kernel developers
> who are so hostile/apathetic/etc.

We only deviate from the upstream kernel to fix bugs, backport features 
or add code that has a clear path to upstream. We do not deviate from 
the upstream kernel to revert a bunch of upstream fixes in order to add 
a feature that's been there for half a decade and still isn't upstream, 
especially when there's been approximately zero user demand for it to 
appear in Fedora. That's a practical attitude, not a hostile or 
apathetic one.

-- 
Matthew Garrett | mj...@srcf.ucam.org