Re: Possible "struct pid" leak from tty_io.c
On 16/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > It seems to fix the leak. I looked at the logs and proc_set_tty calls > put_pid twice for pid 245 (the unresolved leak) and get_pid for pid > 296, which is later passed to put_pid via do_tty_hangup. Ok. Any chance you could help me track down which application is ultimately calling proc_set_tty (I think it has pid 296 in your case). I'll look at this on Monday since I was trying it on an ARM embedded platform in the office (with a Debian filesystem). -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 14/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> How does this look? > > It seems to fix the leak. I looked at the logs and proc_set_tty calls > put_pid twice for pid 245 (the unresolved leak) and get_pid for pid > 296, which is later passed to put_pid via do_tty_hangup. Ok. Any chance you could help me track down which application is ultimately calling proc_set_tty (I think it has pid 296 in your case). >From what I can tell reading the source it is calling TIOCSCTTY with the arg field set to 1. Which is a linux extension to force other people off of the tty. My skimming of the implementation says to me that we are forcing other processes of the tty in the wrong way. I think we should call tty_vhangup (so those processes kicked off get normal terminal hangup behavior) and not simply session_clear_tty. However if I am correct the implementation has been broken for over a decade so I am reluctant to just change it without tracking down the users. The patch below is what I am looking at for a comprehensive fix. I think it fixes the second set of leaks in a better manner by simply fixing callers to do the sane thing. Eric diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index e453268..5140f15 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -1376,6 +1376,8 @@ static void do_tty_hangup(struct work_struct *work) read_unlock(&tasklist_lock); tty->flags = 0; + put_pid(tty->session); + put_pid(tty->pgrp); tty->session = NULL; tty->pgrp = NULL; tty->ctrl_status = 0; @@ -2972,9 +2974,7 @@ static int tiocsctty(struct tty_struct *tty, int arg) /* * Steal it away */ - read_lock(&tasklist_lock); - session_clear_tty(tty->session); - read_unlock(&tasklist_lock); + tty_vhangup(tty); } else { ret = -EPERM; goto unlock; @@ -3850,7 +3850,7 @@ static struct pid *__proc_set_tty(struct task_struct *tsk, struct tty_struct *tt return old_pgrp; } -void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty) +static void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty) { struct pid *old_pgrp; diff --git a/include/linux/tty.h b/include/linux/tty.h index dee72b9..5f7a5fb 100644 --- a/include/linux/tty.h +++ b/include/linux/tty.h @@ -333,7 +333,6 @@ extern int tty_ioctl(struct inode *inode, struct file *file, unsigned int cmd, extern dev_t tty_devnum(struct tty_struct *tty); extern void proc_clear_tty(struct task_struct *p); -extern void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty); extern struct tty_struct *get_current_tty(void); extern struct mutex tty_mutex; diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 19a385e..84b489a 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -1759,12 +1759,16 @@ static inline void flush_unauthorized_files(struct files_struct * files) } file_list_unlock(); - /* Reset controlling tty. */ - if (drop_tty) - proc_set_tty(current, NULL); } mutex_unlock(&tty_mutex); + /* Reset controlling tty. */ + if (drop_tty) { + if (current->signal->leader) + disassociate_ctty(0); + proc_clear_tty(current); + } + /* Revalidate access to inherited open files. */ AVC_AUDIT_DATA_INIT(&ad,FS); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 14/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> How does this look? > > It seems to fix the leak. I looked at the logs and proc_set_tty calls > put_pid twice for pid 245 (the unresolved leak) and get_pid for pid > 296, which is later passed to put_pid via do_tty_hangup. I can see where this would. Now I do have a concern that proc_set_tty. With my current foggy recollections of the semantics of how SIGHUP is sent I think both callers of proc_set_tty are buggy. We steal away the tty and don't send SIGHUP to the old users of the tty. For flush_unauthorized files making that case looks fairly easy. For tiocsctty this it looks more difficult. I need to carefully read through what the rules are again to be certain. There are legitimate cases for not sending SIGHUP. > I still get the "error attempted to write to tty [0x] = NULL" > when debugging is enabled in tty_io.c but it seems harmless. Yes. I think that is the last vestiges of a recent tty layer debugging session. The code is 8 dec 2006, and came in when we started testing for NULL and making a NULL tty there harmless. I remember walking through disassociate_ctty several months ago and not finding any bugs, but I might look again. So anyway I almost have this. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
On 14/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: How does this look? It seems to fix the leak. I looked at the logs and proc_set_tty calls put_pid twice for pid 245 (the unresolved leak) and get_pid for pid 296, which is later passed to put_pid via do_tty_hangup. I still get the "error attempted to write to tty [0x] = NULL" when debugging is enabled in tty_io.c but it seems harmless. I don't have the setup to test this easily, but this bit makes seems to make sense. And I don't have a fully tested kmemleak patch to post either :-). Thanks. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
How does this look? I don't have the setup to test this easily, but this bit makes seems to make sense. I will keep code reviewing and see if I can convince myself that this is correct or incorrect in the mean time... diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index e453268..fc125ac 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -1376,6 +1376,8 @@ static void do_tty_hangup(struct work_struct *work) read_unlock(&tasklist_lock); tty->flags = 0; + put_pid(tty->session); + put_pid(tty->pgrp); tty->session = NULL; tty->pgrp = NULL; tty->ctrl_status = 0; @@ -3841,6 +3843,8 @@ static struct pid *__proc_set_tty(struct task_struct *tsk, struct tty_struct *tt { struct pid *old_pgrp; if (tty) { + put_pid(tty->session); + put_pid(tty->pgrp); tty->session = get_pid(task_session(tsk)); tty->pgrp = get_pid(task_pgrp(tsk)); } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
On 13/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > void proc_clear_tty(struct task_struct *p) > { > + struct tty_struct *tty; > + > spin_lock_irq(&p->sighand->siglock); > + tty = p->signal->tty; > + if (tty) { > + put_pid(tty->session); > + put_pid(tty->pgrp); > + } > p->signal->tty = NULL; > spin_unlock_irq(&p->sighand->siglock); > } This patch can't be right. Not the way proc_clear_tty is called once for each process in the session, plus we aren't clearing tty->session and tty->pgrp here. If the above patch works it's a fluke. I looked at the logs and the pointer isn't freed indeed. It is just a false negative in kmemleak and it would appear as a leak at some point. But the previous patch (do_tty_hangup) seems to fix one of the leaks. For the 2nd leak, proc_set_tty is called and, for symmetry, I added put_pid in proc_clear_tty (but without any deep thought). I also haven't checked any lockdep issues with adding put_pid when p->sighand->siglock is held. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> If I can manage to focus on this, it looks like the information I need to >> start fixing this. > > I had a look at the second leak reported it seems to be caused by the > same proc_set_tty() call but, in this case, there is no > disassociate_tty() call for the task (and the patch I posted is not > enough). Maybe something like below (no thourough testing): > > diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c > index db91398..ea6ca7d 100644 > --- a/drivers/char/tty_io.c > +++ b/drivers/char/tty_io.c > @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum); > > void proc_clear_tty(struct task_struct *p) > { > + struct tty_struct *tty; > + > spin_lock_irq(&p->sighand->siglock); > + tty = p->signal->tty; > + if (tty) { > + put_pid(tty->session); > + put_pid(tty->pgrp); > + } > p->signal->tty = NULL; > spin_unlock_irq(&p->sighand->siglock); > } This patch can't be right. Not the way proc_clear_tty is called once for each process in the session, plus we aren't clearing tty->session and tty->pgrp here. If the above patch works it's a fluke. Still it is the right general area of the code. I've just started looking at this it is going to take me a bit to come up to speed on this code again and see what silly thing is missing. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> If I can manage to focus on this, it looks like the information I need to >> start fixing this. > > I had a look at the second leak reported it seems to be caused by the > same proc_set_tty() call but, in this case, there is no > disassociate_tty() call for the task (and the patch I posted is not > enough). Maybe something like below (no thourough testing): Thanks. That doesn't look to bad. I'm just about to look into this. If I'm lucky all I will need to do is a close code review of your patch. Hopefully I can do that this afternoon. > diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c > index db91398..ea6ca7d 100644 > --- a/drivers/char/tty_io.c > +++ b/drivers/char/tty_io.c > @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum); > > void proc_clear_tty(struct task_struct *p) > { > + struct tty_struct *tty; > + > spin_lock_irq(&p->sighand->siglock); > + tty = p->signal->tty; > + if (tty) { > + put_pid(tty->session); > + put_pid(tty->pgrp); > + } > p->signal->tty = NULL; > spin_unlock_irq(&p->sighand->siglock); > } Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: If I can manage to focus on this, it looks like the information I need to start fixing this. I had a look at the second leak reported it seems to be caused by the same proc_set_tty() call but, in this case, there is no disassociate_tty() call for the task (and the patch I posted is not enough). Maybe something like below (no thourough testing): diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index db91398..ea6ca7d 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum); void proc_clear_tty(struct task_struct *p) { + struct tty_struct *tty; + spin_lock_irq(&p->sighand->siglock); + tty = p->signal->tty; + if (tty) { + put_pid(tty->session); + put_pid(tty->pgrp); + } p->signal->tty = NULL; spin_unlock_irq(&p->sighand->siglock); } -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > Eric, > > For a longer explanation, see the second part of this e-mail. In > short, the patch below seems to fix this particular leak. I'm not sure > that's the correct/complete fix as I seem to still get a 2nd report. > Any info is welcomed. Sure. I was starting to suspect that location myself. > diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c > index e453268..4e33dc2 100644 > --- a/drivers/char/tty_io.c > +++ b/drivers/char/tty_io.c > @@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work) > } > read_unlock(&tasklist_lock); > > + put_pid(tty->session); > + put_pid(tty->pgrp); > + > tty->flags = 0; > tty->session = NULL; > tty->pgrp = NULL; > > On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> "Catalin Marinas" <[EMAIL PROTECTED]> writes: >> > The /sbin/init application calls sys_clone() a few times but only one >> > leak is reported (see below). Looking at the reported pid object (at >> > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 >> > exists any more. > [...] >> > unreferenced object 0xc7c14500 (size 36): >> > comm "init", pid 245, jiffies 4294939289 >> > backtrace: >> >[] kmem_cache_alloc >> >[] alloc_pid >> >[] do_fork >> >[] sys_clone >> >[] ret_fast_syscall >> >> I think this is the path that all pid structures come from so >> unfortunately that doesn't help tracing this problem down. > > No, indeed, but that's the only thing kmemleak can report. Anyway, I > got some more information now, after adding several printk's: > > The difference from other pid objects is that this one (with nr 296) > is passed as a parameter to proc_set_tty(). The __proc_set_tty() > function increments the pid->count twice via get_pid(), and, with two > other get_pid calls, the pid->count for this object gets to 5 (1 being > the initial value). The prints below are function name, struct pid > address (different from the runs yesterday though), pid->nr and > pid->count (after get_pid incrementing). It also show the return > address and symbol (the calling function): > > alloc_pid: c7c149d8, 296, 1 > get_pid: c7c149d8, 296, 2 >return: c0122d64 (proc_set_tty+0x34/0x54) > get_pid: c7c149d8, 296, 3 >return: c0122d64 (proc_set_tty+0x34/0x54) > get_pid: c7c149d8, 296, 4 >return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid > in disassociate_ctty but it is reported like this because of get_pid > inlining > get_pid: c7c149d8, 296, 5 >return: c0124a0c (tty_vhangup+0x14/0x18) > > On the exit path (see below), however, put_pid is called twice before > free_pid and once via release_task -> detach_pid -> free_pid -> ... -> > __rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that > free_pid is called with pid->nr == 3 and the last put_pid gets called > with nr == 3 as well (but it decrements it to 2 and that's what I find > at that memory location). In the trace below, the pid->count is > printed before put_pid modifies it: > > put_pid: c7c149d8, 296, 5 >return: c0124b5c (disassociate_ctty+0x14c/0x230) > put_pid: c7c149d8, 296, 4 >return: c0124ba8 (disassociate_ctty+0x198/0x230) > detach_pid: c7c149d8, 296, 3 >return: c002a230 (release_task+0x1c0/0x358) > detach_pid: c7c149d8, 296, 3 >return: c002a248 (release_task+0x1d8/0x358) > detach_pid: c7c149d8, 296, 3 >return: c002a254 (release_task+0x1e4/0x358) > free_pid: c7c149d8, 296, 3 >return: c003a990 (detach_pid+0xac/0xc8) > ... > delayed_put_pid: c7c149d8, 296, 3 >return: c003af68 (__rcu_process_callbacks+0x19c/0x25c) > put_pid: c7c149d8, 296, 3 >return: c003a8cc (delayed_put_pid+0x54/0x6c) > > In the above disassociate_ctty() function the code below (line 1542) > doesn't seem to get called: > > tty = get_current_tty(); > if (tty) { > put_pid(tty->session); > put_pid(tty->pgrp); > tty->session = NULL; > tty->pgrp = NULL; > } else { > > and I get the following error if TTY_DEBUG_HANGUP is defined - "error > attempted to write to tty [0x] = NULL". > > It looks like the tty_vhangup() call in in disassociate_ctty() sets > current->signal->tty to NULL in the do_each_pid_task loop in > do_tty_hangup (p->signal->tty = NULL). The second call to > get_current_tty() in disassociate_ctty() return NULL and therefore no > put_pid on tty->session and tty->pgrp (which are also set to NULL in > the previous function). Thanks. If I can manage to focus on this, it looks like the information I need to start fixing this. Adding the reference counting when we didn't have any before is always interesting. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> "Catalin Marinas" <[EMAIL PROTECTED]> writes: > > I think it's only the pid_chain and rcu member that could be placed in > a list and kmemleak scans the memory for these two offsets as well. > I'll check those lists anyway but I doubt it's a more fundamental > problem with how kmemleak handles struct pid as I should've probably > got more reports. Right. I was pointing out the possibilities but because we do some tricky things. Mostly I was wondering about the hlist for the list of tasks. Now if a task is on that list we should have a struct pid_link pointing at our struct pid, so it shouldn't fool kmemleak but I'm still a little curious if all of those hlist_heads are NULL pointers. Yes, all the 3 hlist_head tasks are NULL pointers on the reported object. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
Eric, For a longer explanation, see the second part of this e-mail. In short, the patch below seems to fix this particular leak. I'm not sure that's the correct/complete fix as I seem to still get a 2nd report. Any info is welcomed. diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index e453268..4e33dc2 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work) } read_unlock(&tasklist_lock); + put_pid(tty->session); + put_pid(tty->pgrp); + tty->flags = 0; tty->session = NULL; tty->pgrp = NULL; On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > The /sbin/init application calls sys_clone() a few times but only one > leak is reported (see below). Looking at the reported pid object (at > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 > exists any more. [...] > unreferenced object 0xc7c14500 (size 36): > comm "init", pid 245, jiffies 4294939289 > backtrace: >[] kmem_cache_alloc >[] alloc_pid >[] do_fork >[] sys_clone >[] ret_fast_syscall I think this is the path that all pid structures come from so unfortunately that doesn't help tracing this problem down. No, indeed, but that's the only thing kmemleak can report. Anyway, I got some more information now, after adding several printk's: The difference from other pid objects is that this one (with nr 296) is passed as a parameter to proc_set_tty(). The __proc_set_tty() function increments the pid->count twice via get_pid(), and, with two other get_pid calls, the pid->count for this object gets to 5 (1 being the initial value). The prints below are function name, struct pid address (different from the runs yesterday though), pid->nr and pid->count (after get_pid incrementing). It also show the return address and symbol (the calling function): alloc_pid: c7c149d8, 296, 1 get_pid: c7c149d8, 296, 2 return: c0122d64 (proc_set_tty+0x34/0x54) get_pid: c7c149d8, 296, 3 return: c0122d64 (proc_set_tty+0x34/0x54) get_pid: c7c149d8, 296, 4 return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid in disassociate_ctty but it is reported like this because of get_pid inlining get_pid: c7c149d8, 296, 5 return: c0124a0c (tty_vhangup+0x14/0x18) On the exit path (see below), however, put_pid is called twice before free_pid and once via release_task -> detach_pid -> free_pid -> ... -> __rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that free_pid is called with pid->nr == 3 and the last put_pid gets called with nr == 3 as well (but it decrements it to 2 and that's what I find at that memory location). In the trace below, the pid->count is printed before put_pid modifies it: put_pid: c7c149d8, 296, 5 return: c0124b5c (disassociate_ctty+0x14c/0x230) put_pid: c7c149d8, 296, 4 return: c0124ba8 (disassociate_ctty+0x198/0x230) detach_pid: c7c149d8, 296, 3 return: c002a230 (release_task+0x1c0/0x358) detach_pid: c7c149d8, 296, 3 return: c002a248 (release_task+0x1d8/0x358) detach_pid: c7c149d8, 296, 3 return: c002a254 (release_task+0x1e4/0x358) free_pid: c7c149d8, 296, 3 return: c003a990 (detach_pid+0xac/0xc8) ... delayed_put_pid: c7c149d8, 296, 3 return: c003af68 (__rcu_process_callbacks+0x19c/0x25c) put_pid: c7c149d8, 296, 3 return: c003a8cc (delayed_put_pid+0x54/0x6c) In the above disassociate_ctty() function the code below (line 1542) doesn't seem to get called: tty = get_current_tty(); if (tty) { put_pid(tty->session); put_pid(tty->pgrp); tty->session = NULL; tty->pgrp = NULL; } else { and I get the following error if TTY_DEBUG_HANGUP is defined - "error attempted to write to tty [0x] = NULL". It looks like the tty_vhangup() call in in disassociate_ctty() sets current->signal->tty to NULL in the do_each_pid_task loop in do_tty_hangup (p->signal->tty = NULL). The second call to get_current_tty() in disassociate_ctty() return NULL and therefore no put_pid on tty->session and tty->pgrp (which are also set to NULL in the previous function). Regards. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> "Catalin Marinas" <[EMAIL PROTECTED]> writes: > > I think it's only the pid_chain and rcu member that could be placed in > a list and kmemleak scans the memory for these two offsets as well. > I'll check those lists anyway but I doubt it's a more fundamental > problem with how kmemleak handles struct pid as I should've probably > got more reports. Right. I was pointing out the possibilities but because we do some tricky things. Mostly I was wondering about the hlist for the list of tasks. Now if a task is on that list we should have a struct pid_link pointing at our struct pid, so it shouldn't fool kmemleak but I'm still a little curious if all of those hlist_heads are NULL pointers. >> In most any other layer we cache pids indefinitely and a situation >> where we have a pointer to a struct pid with a ref count of 1 long >> after the process goes away is expected. > > Yes, indeed, but what kmemleak reports is that the pid structure > wasn't freed yet and there is no way to determine its pointer directly > or via container_of on members (by scanning the memory), hence it is > considered a leak. Yes that sounds like a leak. >> I don't understand your situation enough to guess what is going wrong >> yet. Hopefully I have given you enough information to get started. > > Yes, many thanks. I'll dig further and let you know. Thanks Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: "Catalin Marinas" <[EMAIL PROTECTED]> writes: > I'm trying to track down a kmemleak report (on an ARM platform) which > seems to have appeared with commit > ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the > TTY layer at all, is it possible that the above commit missed a > put_pid() call on some path? I won't arbitrarily rule a missing put_pid out. I have been know to goof up upon occasion. I'm not entirely sure it's this part of the code, I would have to do some more investigations (I didn't get this leak before). An "unscientific" test shows that if I define get_pid/put_pid in the tty_io.c file so that pid->count is not affected, the leak disappears. This doesn't necessarily prove that the fault is here though. I just did a quick look to see what kmemleak is. A conservative tracing leak detector sounds interesting. Except for all of the list heads which lead to container_of calls I don't know of anything in the struct pid implementation that would be difficult for it to work with. Well that and there is some rcu access protection which can delay the free by a bit. Kmemleak can cope with list heads and rcu delayed freeing as it also checks for pointer aliases (those accessible via container_of). > The /sbin/init application calls sys_clone() a few times but only one > leak is reported (see below). Looking at the reported pid object (at > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 > exists any more. It could still be a valid session or a process group id. If you examine the struct pid you can test for this be examining all of the list heads it keeps. If there is something on any of the lists that would account a count of 1. How we have a count of 2 I don't have enough information to guess. I think it's only the pid_chain and rcu member that could be placed in a list and kmemleak scans the memory for these two offsets as well. I'll check those lists anyway but I doubt it's a more fundamental problem with how kmemleak handles struct pid as I should've probably got more reports. In most any other layer we cache pids indefinitely and a situation where we have a pointer to a struct pid with a ref count of 1 long after the process goes away is expected. Yes, indeed, but what kmemleak reports is that the pid structure wasn't freed yet and there is no way to determine its pointer directly or via container_of on members (by scanning the memory), hence it is considered a leak. I don't understand your situation enough to guess what is going wrong yet. Hopefully I have given you enough information to get started. Yes, many thanks. I'll dig further and let you know. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Possible "struct pid" leak from tty_io.c
"Catalin Marinas" <[EMAIL PROTECTED]> writes: > Hi Eric, > > I'm trying to track down a kmemleak report (on an ARM platform) which > seems to have appeared with commit > ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the > TTY layer at all, is it possible that the above commit missed a > put_pid() call on some path? I won't arbitrarily rule a missing put_pid out. I have been know to goof up upon occasion. I just did a quick look to see what kmemleak is. A conservative tracing leak detector sounds interesting. Except for all of the list heads which lead to container_of calls I don't know of anything in the struct pid implementation that would be difficult for it to work with. Well that and there is some rcu access protection which can delay the free by a bit. > The /sbin/init application calls sys_clone() a few times but only one > leak is reported (see below). Looking at the reported pid object (at > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 > exists any more. It could still be a valid session or a process group id. If you examine the struct pid you can test for this be examining all of the list heads it keeps. If there is something on any of the lists that would account a count of 1. How we have a count of 2 I don't have enough information to guess. Core tty layer handling stops having an remembering pids when the session or process group exits so it is relatively safe from pid wrap around issues without my changes and should make the kind of thing you are reporting very unlikely in a correctly functioning system. In most any other layer we cache pids indefinitely and a situation where we have a pointer to a struct pid with a ref count of 1 long after the process goes away is expected. In this case it is better to hold a reference to a struct pid so we don't do the wrong thing when pid wrap around occurs then to hold a reference to an entire task_struct and lock that in place. I don't understand your situation enough to guess what is going wrong yet. Hopefully I have given you enough information to get started. > unreferenced object 0xc7c14500 (size 36): > comm "init", pid 245, jiffies 4294939289 > backtrace: >[] kmem_cache_alloc >[] alloc_pid >[] do_fork >[] sys_clone >[] ret_fast_syscall I think this is the path that all pid structures come from so unfortunately that doesn't help tracing this problem down. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Possible "struct pid" leak from tty_io.c
Hi Eric, I'm trying to track down a kmemleak report (on an ARM platform) which seems to have appeared with commit ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the TTY layer at all, is it possible that the above commit missed a put_pid() call on some path? The /sbin/init application calls sys_clone() a few times but only one leak is reported (see below). Looking at the reported pid object (at 0xc7c14500), count is 2 and nr is 296 but no process with pid 296 exists any more. unreferenced object 0xc7c14500 (size 36): comm "init", pid 245, jiffies 4294939289 backtrace: [] kmem_cache_alloc [] alloc_pid [] do_fork [] sys_clone [] ret_fast_syscall Thanks. -- Catalin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/