Re: Possible "struct pid" leak from tty_io.c

2007-03-16 Thread Catalin Marinas

On 16/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Catalin Marinas" <[EMAIL PROTECTED]> writes:
> It seems to fix the leak. I looked at the logs and proc_set_tty calls
> put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
> 296, which is later passed to put_pid via do_tty_hangup.

Ok.  Any chance you could help me track down which application is
ultimately calling proc_set_tty (I think it has pid 296 in your case).


I'll look at this on Monday since I was trying it on an ARM embedded
platform in the office (with a Debian filesystem).

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-16 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> On 14/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> How does this look?
>
> It seems to fix the leak. I looked at the logs and proc_set_tty calls
> put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
> 296, which is later passed to put_pid via do_tty_hangup.

Ok.  Any chance you could help me track down which application is
ultimately calling proc_set_tty (I think it has pid 296 in your case).

>From what I can tell reading the source it is calling
TIOCSCTTY with the arg field set to 1.  Which is a linux extension
to force other people off of the tty.

My skimming of the implementation says to me that we are forcing
other processes of the tty in the wrong way.  I think we should call
tty_vhangup (so those processes kicked off get normal terminal hangup
behavior) and not simply session_clear_tty.  However if I am correct the
implementation has been broken for over a decade so I am reluctant to
just change it without tracking down the users.  

The patch below is what I am looking at for a comprehensive fix.  I
think it fixes the second set of leaks in a better manner by simply
fixing callers to do the sane thing.

Eric


diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..5140f15 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1376,6 +1376,8 @@ static void do_tty_hangup(struct work_struct *work)
read_unlock(_lock);
 
tty->flags = 0;
+   put_pid(tty->session);
+   put_pid(tty->pgrp);
tty->session = NULL;
tty->pgrp = NULL;
tty->ctrl_status = 0;
@@ -2972,9 +2974,7 @@ static int tiocsctty(struct tty_struct *tty, int arg)
/*
 * Steal it away
 */
-   read_lock(_lock);
-   session_clear_tty(tty->session);
-   read_unlock(_lock);
+   tty_vhangup(tty);
} else {
ret = -EPERM;
goto unlock;
@@ -3850,7 +3850,7 @@ static struct pid *__proc_set_tty(struct task_struct 
*tsk, struct tty_struct *tt
return old_pgrp;
 }
 
-void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty)
+static void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty)
 {
struct pid *old_pgrp;
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index dee72b9..5f7a5fb 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -333,7 +333,6 @@ extern int tty_ioctl(struct inode *inode, struct file 
*file, unsigned int cmd,
 
 extern dev_t tty_devnum(struct tty_struct *tty);
 extern void proc_clear_tty(struct task_struct *p);
-extern void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty);
 extern struct tty_struct *get_current_tty(void);
 
 extern struct mutex tty_mutex;
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 19a385e..84b489a 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1759,12 +1759,16 @@ static inline void flush_unauthorized_files(struct 
files_struct * files)
}
file_list_unlock();
 
-   /* Reset controlling tty. */
-   if (drop_tty)
-   proc_set_tty(current, NULL);
}
mutex_unlock(_mutex);
 
+   /* Reset controlling tty. */
+   if (drop_tty) {
+   if (current->signal->leader)
+   disassociate_ctty(0);
+   proc_clear_tty(current);
+   }
+
/* Revalidate access to inherited open files. */
 
AVC_AUDIT_DATA_INIT(,FS);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-16 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 On 14/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 How does this look?

 It seems to fix the leak. I looked at the logs and proc_set_tty calls
 put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
 296, which is later passed to put_pid via do_tty_hangup.

Ok.  Any chance you could help me track down which application is
ultimately calling proc_set_tty (I think it has pid 296 in your case).

From what I can tell reading the source it is calling
TIOCSCTTY with the arg field set to 1.  Which is a linux extension
to force other people off of the tty.

My skimming of the implementation says to me that we are forcing
other processes of the tty in the wrong way.  I think we should call
tty_vhangup (so those processes kicked off get normal terminal hangup
behavior) and not simply session_clear_tty.  However if I am correct the
implementation has been broken for over a decade so I am reluctant to
just change it without tracking down the users.  

The patch below is what I am looking at for a comprehensive fix.  I
think it fixes the second set of leaks in a better manner by simply
fixing callers to do the sane thing.

Eric


diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..5140f15 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1376,6 +1376,8 @@ static void do_tty_hangup(struct work_struct *work)
read_unlock(tasklist_lock);
 
tty-flags = 0;
+   put_pid(tty-session);
+   put_pid(tty-pgrp);
tty-session = NULL;
tty-pgrp = NULL;
tty-ctrl_status = 0;
@@ -2972,9 +2974,7 @@ static int tiocsctty(struct tty_struct *tty, int arg)
/*
 * Steal it away
 */
-   read_lock(tasklist_lock);
-   session_clear_tty(tty-session);
-   read_unlock(tasklist_lock);
+   tty_vhangup(tty);
} else {
ret = -EPERM;
goto unlock;
@@ -3850,7 +3850,7 @@ static struct pid *__proc_set_tty(struct task_struct 
*tsk, struct tty_struct *tt
return old_pgrp;
 }
 
-void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty)
+static void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty)
 {
struct pid *old_pgrp;
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index dee72b9..5f7a5fb 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -333,7 +333,6 @@ extern int tty_ioctl(struct inode *inode, struct file 
*file, unsigned int cmd,
 
 extern dev_t tty_devnum(struct tty_struct *tty);
 extern void proc_clear_tty(struct task_struct *p);
-extern void proc_set_tty(struct task_struct *tsk, struct tty_struct *tty);
 extern struct tty_struct *get_current_tty(void);
 
 extern struct mutex tty_mutex;
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 19a385e..84b489a 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1759,12 +1759,16 @@ static inline void flush_unauthorized_files(struct 
files_struct * files)
}
file_list_unlock();
 
-   /* Reset controlling tty. */
-   if (drop_tty)
-   proc_set_tty(current, NULL);
}
mutex_unlock(tty_mutex);
 
+   /* Reset controlling tty. */
+   if (drop_tty) {
+   if (current-signal-leader)
+   disassociate_ctty(0);
+   proc_clear_tty(current);
+   }
+
/* Revalidate access to inherited open files. */
 
AVC_AUDIT_DATA_INIT(ad,FS);
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-16 Thread Catalin Marinas

On 16/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Catalin Marinas [EMAIL PROTECTED] writes:
 It seems to fix the leak. I looked at the logs and proc_set_tty calls
 put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
 296, which is later passed to put_pid via do_tty_hangup.

Ok.  Any chance you could help me track down which application is
ultimately calling proc_set_tty (I think it has pid 296 in your case).


I'll look at this on Monday since I was trying it on an ARM embedded
platform in the office (with a Debian filesystem).

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-15 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> On 14/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> How does this look?
>
> It seems to fix the leak. I looked at the logs and proc_set_tty calls
> put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
> 296, which is later passed to put_pid via do_tty_hangup.

I can see where this would.  Now I do have a concern that proc_set_tty.
With my current foggy recollections of the semantics of how SIGHUP is
sent I think both callers of proc_set_tty are buggy.  We steal away
the tty and don't send SIGHUP to the old users of the tty.

For flush_unauthorized files making that case looks fairly easy.
For tiocsctty this it looks more difficult.

I need to carefully read through what the rules are again to be certain.
There are legitimate cases for not sending SIGHUP.

> I still get the "error attempted to write to tty [0x] = NULL"
> when debugging is enabled in tty_io.c but it seems harmless.

Yes.  I think that is the last vestiges of a recent tty layer debugging
session.  The code is 8 dec 2006, and came in when we started testing
for NULL and making a NULL tty there harmless.

I remember walking through disassociate_ctty several months ago and
not finding any bugs, but I might look again.

So anyway I almost have this.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-15 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 On 14/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 How does this look?

 It seems to fix the leak. I looked at the logs and proc_set_tty calls
 put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
 296, which is later passed to put_pid via do_tty_hangup.

I can see where this would.  Now I do have a concern that proc_set_tty.
With my current foggy recollections of the semantics of how SIGHUP is
sent I think both callers of proc_set_tty are buggy.  We steal away
the tty and don't send SIGHUP to the old users of the tty.

For flush_unauthorized files making that case looks fairly easy.
For tiocsctty this it looks more difficult.

I need to carefully read through what the rules are again to be certain.
There are legitimate cases for not sending SIGHUP.

 I still get the error attempted to write to tty [0x] = NULL
 when debugging is enabled in tty_io.c but it seems harmless.

Yes.  I think that is the last vestiges of a recent tty layer debugging
session.  The code is 8 dec 2006, and came in when we started testing
for NULL and making a NULL tty there harmless.

I remember walking through disassociate_ctty several months ago and
not finding any bugs, but I might look again.

So anyway I almost have this.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-14 Thread Catalin Marinas

On 14/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

How does this look?


It seems to fix the leak. I looked at the logs and proc_set_tty calls
put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
296, which is later passed to put_pid via do_tty_hangup.

I still get the "error attempted to write to tty [0x] = NULL"
when debugging is enabled in tty_io.c but it seems harmless.


I don't have the setup to test this easily, but this bit makes seems to
make sense.


And I don't have a fully tested kmemleak patch to post either :-).

Thanks.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-14 Thread Eric W. Biederman

How does this look?

I don't have the setup to test this easily, but this bit makes seems to make
sense.  I will keep code reviewing and see if I can convince myself that this
is correct or incorrect in the mean time...

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..fc125ac 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1376,6 +1376,8 @@ static void do_tty_hangup(struct work_struct *work)
read_unlock(_lock);
 
tty->flags = 0;
+   put_pid(tty->session);
+   put_pid(tty->pgrp);
tty->session = NULL;
tty->pgrp = NULL;
tty->ctrl_status = 0;
@@ -3841,6 +3843,8 @@ static struct pid *__proc_set_tty(struct task_struct 
*tsk, struct tty_struct *tt
 {
struct pid *old_pgrp;
if (tty) {
+   put_pid(tty->session);
+   put_pid(tty->pgrp);
tty->session = get_pid(task_session(tsk));
tty->pgrp = get_pid(task_pgrp(tsk));
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-14 Thread Catalin Marinas

On 13/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Catalin Marinas" <[EMAIL PROTECTED]> writes:
> void proc_clear_tty(struct task_struct *p)
> {
> + struct tty_struct *tty;
> +
>   spin_lock_irq(>sighand->siglock);
> + tty = p->signal->tty;
> + if (tty) {
> + put_pid(tty->session);
> + put_pid(tty->pgrp);
> + }
>   p->signal->tty = NULL;
>   spin_unlock_irq(>sighand->siglock);
> }

This patch can't be right.  Not the way proc_clear_tty is called
once for each process in the session, plus we aren't clearing
tty->session and tty->pgrp here.

If the above patch works it's a fluke.


I looked at the logs and the pointer isn't freed indeed. It is just a
false negative in kmemleak and it would appear as a leak at some
point. But the previous patch (do_tty_hangup) seems to fix one of the
leaks.

For the 2nd leak, proc_set_tty is called and, for symmetry, I added
put_pid in proc_clear_tty (but without any deep thought). I also
haven't checked any lockdep issues with adding put_pid when
p->sighand->siglock is held.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-14 Thread Catalin Marinas

On 13/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Catalin Marinas [EMAIL PROTECTED] writes:
 void proc_clear_tty(struct task_struct *p)
 {
 + struct tty_struct *tty;
 +
   spin_lock_irq(p-sighand-siglock);
 + tty = p-signal-tty;
 + if (tty) {
 + put_pid(tty-session);
 + put_pid(tty-pgrp);
 + }
   p-signal-tty = NULL;
   spin_unlock_irq(p-sighand-siglock);
 }

This patch can't be right.  Not the way proc_clear_tty is called
once for each process in the session, plus we aren't clearing
tty-session and tty-pgrp here.

If the above patch works it's a fluke.


I looked at the logs and the pointer isn't freed indeed. It is just a
false negative in kmemleak and it would appear as a leak at some
point. But the previous patch (do_tty_hangup) seems to fix one of the
leaks.

For the 2nd leak, proc_set_tty is called and, for symmetry, I added
put_pid in proc_clear_tty (but without any deep thought). I also
haven't checked any lockdep issues with adding put_pid when
p-sighand-siglock is held.

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-14 Thread Eric W. Biederman

How does this look?

I don't have the setup to test this easily, but this bit makes seems to make
sense.  I will keep code reviewing and see if I can convince myself that this
is correct or incorrect in the mean time...

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..fc125ac 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1376,6 +1376,8 @@ static void do_tty_hangup(struct work_struct *work)
read_unlock(tasklist_lock);
 
tty-flags = 0;
+   put_pid(tty-session);
+   put_pid(tty-pgrp);
tty-session = NULL;
tty-pgrp = NULL;
tty-ctrl_status = 0;
@@ -3841,6 +3843,8 @@ static struct pid *__proc_set_tty(struct task_struct 
*tsk, struct tty_struct *tt
 {
struct pid *old_pgrp;
if (tty) {
+   put_pid(tty-session);
+   put_pid(tty-pgrp);
tty-session = get_pid(task_session(tsk));
tty-pgrp = get_pid(task_pgrp(tsk));
}
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-14 Thread Catalin Marinas

On 14/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

How does this look?


It seems to fix the leak. I looked at the logs and proc_set_tty calls
put_pid twice for pid 245 (the unresolved leak) and get_pid for pid
296, which is later passed to put_pid via do_tty_hangup.

I still get the error attempted to write to tty [0x] = NULL
when debugging is enabled in tty_io.c but it seems harmless.


I don't have the setup to test this easily, but this bit makes seems to
make sense.


And I don't have a fully tested kmemleak patch to post either :-).

Thanks.

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-13 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> If I can manage to focus on this, it looks like the information I need to
>> start fixing this.
>
> I had a look at the second leak reported it seems to be caused by the
> same proc_set_tty() call but, in this case, there is no
> disassociate_tty() call for the task (and the patch I posted is not
> enough). Maybe something like below (no thourough testing):
>
> diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
> index db91398..ea6ca7d 100644
> --- a/drivers/char/tty_io.c
> +++ b/drivers/char/tty_io.c
> @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum);
>
> void proc_clear_tty(struct task_struct *p)
> {
> + struct tty_struct *tty;
> +
>   spin_lock_irq(>sighand->siglock);
> + tty = p->signal->tty;
> + if (tty) {
> + put_pid(tty->session);
> + put_pid(tty->pgrp);
> + }
>   p->signal->tty = NULL;
>   spin_unlock_irq(>sighand->siglock);
> }

This patch can't be right.  Not the way proc_clear_tty is called
once for each process in the session, plus we aren't clearing
tty->session and tty->pgrp here. 

If the above patch works it's a fluke.

Still it is the right general area of the code.  I've just started
looking at this it is going to take me a bit to come up to speed on
this code again and see what silly thing is missing.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-13 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 On 09/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 If I can manage to focus on this, it looks like the information I need to
 start fixing this.

 I had a look at the second leak reported it seems to be caused by the
 same proc_set_tty() call but, in this case, there is no
 disassociate_tty() call for the task (and the patch I posted is not
 enough). Maybe something like below (no thourough testing):

 diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
 index db91398..ea6ca7d 100644
 --- a/drivers/char/tty_io.c
 +++ b/drivers/char/tty_io.c
 @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum);

 void proc_clear_tty(struct task_struct *p)
 {
 + struct tty_struct *tty;
 +
   spin_lock_irq(p-sighand-siglock);
 + tty = p-signal-tty;
 + if (tty) {
 + put_pid(tty-session);
 + put_pid(tty-pgrp);
 + }
   p-signal-tty = NULL;
   spin_unlock_irq(p-sighand-siglock);
 }

This patch can't be right.  Not the way proc_clear_tty is called
once for each process in the session, plus we aren't clearing
tty-session and tty-pgrp here. 

If the above patch works it's a fluke.

Still it is the right general area of the code.  I've just started
looking at this it is going to take me a bit to come up to speed on
this code again and see what silly thing is missing.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-12 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> If I can manage to focus on this, it looks like the information I need to
>> start fixing this.
>
> I had a look at the second leak reported it seems to be caused by the
> same proc_set_tty() call but, in this case, there is no
> disassociate_tty() call for the task (and the patch I posted is not
> enough). Maybe something like below (no thourough testing):

Thanks.  That doesn't look to bad. I'm just about to look into this.
If I'm lucky all I will need to do is a close code review of your
patch.

Hopefully I can do that this afternoon.

> diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
> index db91398..ea6ca7d 100644
> --- a/drivers/char/tty_io.c
> +++ b/drivers/char/tty_io.c
> @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum);
>
> void proc_clear_tty(struct task_struct *p)
> {
> + struct tty_struct *tty;
> +
>   spin_lock_irq(>sighand->siglock);
> + tty = p->signal->tty;
> + if (tty) {
> + put_pid(tty->session);
> + put_pid(tty->pgrp);
> + }
>   p->signal->tty = NULL;
>   spin_unlock_irq(>sighand->siglock);
> }

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-12 Thread Catalin Marinas

On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

If I can manage to focus on this, it looks like the information I need to
start fixing this.


I had a look at the second leak reported it seems to be caused by the
same proc_set_tty() call but, in this case, there is no
disassociate_tty() call for the task (and the patch I posted is not
enough). Maybe something like below (no thourough testing):

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index db91398..ea6ca7d 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum);

void proc_clear_tty(struct task_struct *p)
{
+   struct tty_struct *tty;
+
spin_lock_irq(>sighand->siglock);
+   tty = p->signal->tty;
+   if (tty) {
+   put_pid(tty->session);
+   put_pid(tty->pgrp);
+   }
p->signal->tty = NULL;
spin_unlock_irq(>sighand->siglock);
}

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-12 Thread Catalin Marinas

On 09/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

If I can manage to focus on this, it looks like the information I need to
start fixing this.


I had a look at the second leak reported it seems to be caused by the
same proc_set_tty() call but, in this case, there is no
disassociate_tty() call for the task (and the patch I posted is not
enough). Maybe something like below (no thourough testing):

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index db91398..ea6ca7d 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum);

void proc_clear_tty(struct task_struct *p)
{
+   struct tty_struct *tty;
+
spin_lock_irq(p-sighand-siglock);
+   tty = p-signal-tty;
+   if (tty) {
+   put_pid(tty-session);
+   put_pid(tty-pgrp);
+   }
p-signal-tty = NULL;
spin_unlock_irq(p-sighand-siglock);
}

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-12 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 On 09/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 If I can manage to focus on this, it looks like the information I need to
 start fixing this.

 I had a look at the second leak reported it seems to be caused by the
 same proc_set_tty() call but, in this case, there is no
 disassociate_tty() call for the task (and the patch I posted is not
 enough). Maybe something like below (no thourough testing):

Thanks.  That doesn't look to bad. I'm just about to look into this.
If I'm lucky all I will need to do is a close code review of your
patch.

Hopefully I can do that this afternoon.

 diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
 index db91398..ea6ca7d 100644
 --- a/drivers/char/tty_io.c
 +++ b/drivers/char/tty_io.c
 @@ -3854,7 +3854,14 @@ EXPORT_SYMBOL(tty_devnum);

 void proc_clear_tty(struct task_struct *p)
 {
 + struct tty_struct *tty;
 +
   spin_lock_irq(p-sighand-siglock);
 + tty = p-signal-tty;
 + if (tty) {
 + put_pid(tty-session);
 + put_pid(tty-pgrp);
 + }
   p-signal-tty = NULL;
   spin_unlock_irq(p-sighand-siglock);
 }

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-09 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> Eric,
>
> For a longer explanation, see the second part of this e-mail. In
> short, the patch below seems to fix this particular leak. I'm not sure
> that's the correct/complete fix as I seem to still get a 2nd report.
> Any info is welcomed.

Sure.  I was starting to suspect that location myself.

> diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
> index e453268..4e33dc2 100644
> --- a/drivers/char/tty_io.c
> +++ b/drivers/char/tty_io.c
> @@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work)
>   }
>   read_unlock(_lock);
>
> + put_pid(tty->session);
> + put_pid(tty->pgrp);
> +
>   tty->flags = 0;
>   tty->session = NULL;
>   tty->pgrp = NULL;
>
> On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> "Catalin Marinas" <[EMAIL PROTECTED]> writes:
>> > The /sbin/init application calls sys_clone() a few times but only one
>> > leak is reported (see below). Looking at the reported pid object (at
>> > 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
>> > exists any more.
> [...]
>> > unreferenced object 0xc7c14500 (size 36):
>> >  comm "init", pid 245, jiffies 4294939289
>> >  backtrace:
>> >[] kmem_cache_alloc
>> >[] alloc_pid
>> >[] do_fork
>> >[] sys_clone
>> >[] ret_fast_syscall
>>
>> I think this is the path that all pid structures come from so
>> unfortunately that doesn't help tracing this problem down.
>
> No, indeed, but that's the only thing kmemleak can report. Anyway, I
> got some more information now, after adding several printk's:
>
> The difference from other pid objects is that this one (with nr 296)
> is passed as a parameter to proc_set_tty(). The __proc_set_tty()
> function increments the pid->count twice via get_pid(), and, with two
> other get_pid calls, the pid->count for this object gets to 5 (1 being
> the initial value). The prints below are function name, struct pid
> address (different from the runs yesterday though), pid->nr and
> pid->count (after get_pid incrementing). It also show the return
> address and symbol (the calling function):
>
>  alloc_pid: c7c149d8, 296, 1
>  get_pid: c7c149d8, 296, 2
>return: c0122d64 (proc_set_tty+0x34/0x54)
>  get_pid: c7c149d8, 296, 3
>return: c0122d64 (proc_set_tty+0x34/0x54)
>  get_pid: c7c149d8, 296, 4
>return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid
>  in disassociate_ctty but it is reported like this because of get_pid
>  inlining
>  get_pid: c7c149d8, 296, 5
>return: c0124a0c (tty_vhangup+0x14/0x18)
>
> On the exit path (see below), however, put_pid is called twice before
> free_pid and once via release_task -> detach_pid -> free_pid -> ... ->
> __rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that
> free_pid is called with pid->nr == 3 and the last put_pid gets called
> with nr == 3 as well (but it decrements it to 2 and that's what I find
> at that memory location). In the trace below, the pid->count is
> printed before put_pid modifies it:
>
>  put_pid: c7c149d8, 296, 5
>return: c0124b5c (disassociate_ctty+0x14c/0x230)
>  put_pid: c7c149d8, 296, 4
>return: c0124ba8 (disassociate_ctty+0x198/0x230)
>  detach_pid: c7c149d8, 296, 3
>return: c002a230 (release_task+0x1c0/0x358)
>  detach_pid: c7c149d8, 296, 3
>return: c002a248 (release_task+0x1d8/0x358)
>  detach_pid: c7c149d8, 296, 3
>return: c002a254 (release_task+0x1e4/0x358)
>  free_pid: c7c149d8, 296, 3
>return: c003a990 (detach_pid+0xac/0xc8)
>  ...
>  delayed_put_pid: c7c149d8, 296, 3
>return: c003af68 (__rcu_process_callbacks+0x19c/0x25c)
>  put_pid: c7c149d8, 296, 3
>return: c003a8cc (delayed_put_pid+0x54/0x6c)
>
> In the above disassociate_ctty() function the code below (line 1542)
> doesn't seem to get called:
>
>   tty = get_current_tty();
>   if (tty) {
>   put_pid(tty->session);
>   put_pid(tty->pgrp);
>   tty->session = NULL;
>   tty->pgrp = NULL;
>   } else {
>
> and I get the following error if TTY_DEBUG_HANGUP is defined - "error
> attempted to write to tty [0x] = NULL".
>
> It looks like the tty_vhangup() call in in disassociate_ctty() sets
> current->signal->tty to NULL in the do_each_pid_task loop in
> do_tty_hangup (p->signal->tty = NULL). The second call to
> get_current_tty() in disassociate_ctty() return NULL and therefore no
> put_pid on tty->session and tty->pgrp (which are also set to NULL in
> the previous function).

Thanks.

If I can manage to focus on this, it looks like the information I need to 
start fixing this.

Adding the reference counting when we didn't have any before is always
interesting.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-09 Thread Catalin Marinas

On 09/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> "Catalin Marinas" <[EMAIL PROTECTED]> writes:
>
> I think it's only the pid_chain and rcu member that could be placed in
> a list and kmemleak scans the memory for these two offsets as well.
> I'll check those lists anyway but I doubt it's a more fundamental
> problem with how kmemleak handles struct pid as I should've probably
> got more reports.

Right.  I was pointing out the possibilities but because we do
some tricky things.  Mostly I was wondering about the hlist for
the list of tasks.  Now if a task is on that list we should have
a struct pid_link pointing at our struct pid, so it shouldn't fool
kmemleak but I'm still a little curious if all of those hlist_heads are
NULL pointers.


Yes, all the 3 hlist_head tasks are NULL pointers on the reported object.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-09 Thread Catalin Marinas

Eric,

For a longer explanation, see the second part of this e-mail. In
short, the patch below seems to fix this particular leak. I'm not sure
that's the correct/complete fix as I seem to still get a 2nd report.
Any info is welcomed.

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..4e33dc2 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work)
}
read_unlock(_lock);

+   put_pid(tty->session);
+   put_pid(tty->pgrp);
+
tty->flags = 0;
tty->session = NULL;
tty->pgrp = NULL;

On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Catalin Marinas" <[EMAIL PROTECTED]> writes:
> The /sbin/init application calls sys_clone() a few times but only one
> leak is reported (see below). Looking at the reported pid object (at
> 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
> exists any more.

[...]

> unreferenced object 0xc7c14500 (size 36):
>  comm "init", pid 245, jiffies 4294939289
>  backtrace:
>[] kmem_cache_alloc
>[] alloc_pid
>[] do_fork
>[] sys_clone
>[] ret_fast_syscall

I think this is the path that all pid structures come from so
unfortunately that doesn't help tracing this problem down.


No, indeed, but that's the only thing kmemleak can report. Anyway, I
got some more information now, after adding several printk's:

The difference from other pid objects is that this one (with nr 296)
is passed as a parameter to proc_set_tty(). The __proc_set_tty()
function increments the pid->count twice via get_pid(), and, with two
other get_pid calls, the pid->count for this object gets to 5 (1 being
the initial value). The prints below are function name, struct pid
address (different from the runs yesterday though), pid->nr and
pid->count (after get_pid incrementing). It also show the return
address and symbol (the calling function):

 alloc_pid: c7c149d8, 296, 1
 get_pid: c7c149d8, 296, 2
   return: c0122d64 (proc_set_tty+0x34/0x54)
 get_pid: c7c149d8, 296, 3
   return: c0122d64 (proc_set_tty+0x34/0x54)
 get_pid: c7c149d8, 296, 4
   return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid
 in disassociate_ctty but it is reported like this because of get_pid
 inlining
 get_pid: c7c149d8, 296, 5
   return: c0124a0c (tty_vhangup+0x14/0x18)

On the exit path (see below), however, put_pid is called twice before
free_pid and once via release_task -> detach_pid -> free_pid -> ... ->
__rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that
free_pid is called with pid->nr == 3 and the last put_pid gets called
with nr == 3 as well (but it decrements it to 2 and that's what I find
at that memory location). In the trace below, the pid->count is
printed before put_pid modifies it:

 put_pid: c7c149d8, 296, 5
   return: c0124b5c (disassociate_ctty+0x14c/0x230)
 put_pid: c7c149d8, 296, 4
   return: c0124ba8 (disassociate_ctty+0x198/0x230)
 detach_pid: c7c149d8, 296, 3
   return: c002a230 (release_task+0x1c0/0x358)
 detach_pid: c7c149d8, 296, 3
   return: c002a248 (release_task+0x1d8/0x358)
 detach_pid: c7c149d8, 296, 3
   return: c002a254 (release_task+0x1e4/0x358)
 free_pid: c7c149d8, 296, 3
   return: c003a990 (detach_pid+0xac/0xc8)
 ...
 delayed_put_pid: c7c149d8, 296, 3
   return: c003af68 (__rcu_process_callbacks+0x19c/0x25c)
 put_pid: c7c149d8, 296, 3
   return: c003a8cc (delayed_put_pid+0x54/0x6c)

In the above disassociate_ctty() function the code below (line 1542)
doesn't seem to get called:

tty = get_current_tty();
if (tty) {
put_pid(tty->session);
put_pid(tty->pgrp);
tty->session = NULL;
tty->pgrp = NULL;
} else {

and I get the following error if TTY_DEBUG_HANGUP is defined - "error
attempted to write to tty [0x] = NULL".

It looks like the tty_vhangup() call in in disassociate_ctty() sets
current->signal->tty to NULL in the do_each_pid_task loop in
do_tty_hangup (p->signal->tty = NULL). The second call to
get_current_tty() in disassociate_ctty() return NULL and therefore no
put_pid on tty->session and tty->pgrp (which are also set to NULL in
the previous function).

Regards.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-09 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> "Catalin Marinas" <[EMAIL PROTECTED]> writes:
>
> I think it's only the pid_chain and rcu member that could be placed in
> a list and kmemleak scans the memory for these two offsets as well.
> I'll check those lists anyway but I doubt it's a more fundamental
> problem with how kmemleak handles struct pid as I should've probably
> got more reports.

Right.  I was pointing out the possibilities but because we do
some tricky things.  Mostly I was wondering about the hlist for
the list of tasks.  Now if a task is on that list we should have
a struct pid_link pointing at our struct pid, so it shouldn't fool
kmemleak but I'm still a little curious if all of those hlist_heads are
NULL pointers.

>> In most any other layer we cache pids indefinitely and a situation
>> where we have a pointer to a struct pid with a ref count of 1 long
>> after the process goes away is expected.
>
> Yes, indeed, but what kmemleak reports is that the pid structure
> wasn't freed yet and there is no way to determine its pointer directly
> or via container_of on members (by scanning the memory), hence it is
> considered a leak.

Yes that sounds like a leak.

>> I don't understand your situation enough to guess what is going wrong
>> yet.  Hopefully I have given you enough information to get started.
>
> Yes, many thanks. I'll dig further and let you know.

Thanks

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-09 Thread Catalin Marinas

On 08/03/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Catalin Marinas" <[EMAIL PROTECTED]> writes:
> I'm trying to track down a kmemleak report (on an ARM platform) which
> seems to have appeared with commit
> ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the
> TTY layer at all, is it possible that the above commit missed a
> put_pid() call on some path?

I won't arbitrarily rule a missing put_pid out.  I have been know to
goof up upon occasion.


I'm not entirely sure it's this part of the code, I would have to do
some more investigations (I didn't get this leak before). An
"unscientific" test shows that if I define get_pid/put_pid in the
tty_io.c file so that pid->count is not affected, the leak disappears.
This doesn't necessarily prove that the fault is here though.


I just did a quick look to see what kmemleak is.  A conservative
tracing leak detector sounds interesting.  Except for all of the list
heads which lead to container_of calls I don't know of anything in the
struct pid implementation that would be difficult for it to work with.
Well that and there is some rcu access protection which can delay the
free by a bit.


Kmemleak can cope with list heads and rcu delayed freeing as it also
checks for pointer aliases (those accessible via container_of).


> The /sbin/init application calls sys_clone() a few times but only one
> leak is reported (see below). Looking at the reported pid object (at
> 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
> exists any more.

It could still be a valid session or a process group id.
If you examine the struct pid you can test for this be examining all
of the list heads it keeps.  If there is something on any of the
lists that would account a count of 1.  How we have a count of 2
I don't have enough information to guess.


I think it's only the pid_chain and rcu member that could be placed in
a list and kmemleak scans the memory for these two offsets as well.
I'll check those lists anyway but I doubt it's a more fundamental
problem with how kmemleak handles struct pid as I should've probably
got more reports.


In most any other layer we cache pids indefinitely and a situation
where we have a pointer to a struct pid with a ref count of 1 long
after the process goes away is expected.


Yes, indeed, but what kmemleak reports is that the pid structure
wasn't freed yet and there is no way to determine its pointer directly
or via container_of on members (by scanning the memory), hence it is
considered a leak.


I don't understand your situation enough to guess what is going wrong
yet.  Hopefully I have given you enough information to get started.


Yes, many thanks. I'll dig further and let you know.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-09 Thread Catalin Marinas

On 08/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Catalin Marinas [EMAIL PROTECTED] writes:
 I'm trying to track down a kmemleak report (on an ARM platform) which
 seems to have appeared with commit
 ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the
 TTY layer at all, is it possible that the above commit missed a
 put_pid() call on some path?

I won't arbitrarily rule a missing put_pid out.  I have been know to
goof up upon occasion.


I'm not entirely sure it's this part of the code, I would have to do
some more investigations (I didn't get this leak before). An
unscientific test shows that if I define get_pid/put_pid in the
tty_io.c file so that pid-count is not affected, the leak disappears.
This doesn't necessarily prove that the fault is here though.


I just did a quick look to see what kmemleak is.  A conservative
tracing leak detector sounds interesting.  Except for all of the list
heads which lead to container_of calls I don't know of anything in the
struct pid implementation that would be difficult for it to work with.
Well that and there is some rcu access protection which can delay the
free by a bit.


Kmemleak can cope with list heads and rcu delayed freeing as it also
checks for pointer aliases (those accessible via container_of).


 The /sbin/init application calls sys_clone() a few times but only one
 leak is reported (see below). Looking at the reported pid object (at
 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
 exists any more.

It could still be a valid session or a process group id.
If you examine the struct pid you can test for this be examining all
of the list heads it keeps.  If there is something on any of the
lists that would account a count of 1.  How we have a count of 2
I don't have enough information to guess.


I think it's only the pid_chain and rcu member that could be placed in
a list and kmemleak scans the memory for these two offsets as well.
I'll check those lists anyway but I doubt it's a more fundamental
problem with how kmemleak handles struct pid as I should've probably
got more reports.


In most any other layer we cache pids indefinitely and a situation
where we have a pointer to a struct pid with a ref count of 1 long
after the process goes away is expected.


Yes, indeed, but what kmemleak reports is that the pid structure
wasn't freed yet and there is no way to determine its pointer directly
or via container_of on members (by scanning the memory), hence it is
considered a leak.


I don't understand your situation enough to guess what is going wrong
yet.  Hopefully I have given you enough information to get started.


Yes, many thanks. I'll dig further and let you know.

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-09 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 On 08/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 Catalin Marinas [EMAIL PROTECTED] writes:

 I think it's only the pid_chain and rcu member that could be placed in
 a list and kmemleak scans the memory for these two offsets as well.
 I'll check those lists anyway but I doubt it's a more fundamental
 problem with how kmemleak handles struct pid as I should've probably
 got more reports.

Right.  I was pointing out the possibilities but because we do
some tricky things.  Mostly I was wondering about the hlist for
the list of tasks.  Now if a task is on that list we should have
a struct pid_link pointing at our struct pid, so it shouldn't fool
kmemleak but I'm still a little curious if all of those hlist_heads are
NULL pointers.

 In most any other layer we cache pids indefinitely and a situation
 where we have a pointer to a struct pid with a ref count of 1 long
 after the process goes away is expected.

 Yes, indeed, but what kmemleak reports is that the pid structure
 wasn't freed yet and there is no way to determine its pointer directly
 or via container_of on members (by scanning the memory), hence it is
 considered a leak.

Yes that sounds like a leak.

 I don't understand your situation enough to guess what is going wrong
 yet.  Hopefully I have given you enough information to get started.

 Yes, many thanks. I'll dig further and let you know.

Thanks

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-09 Thread Catalin Marinas

Eric,

For a longer explanation, see the second part of this e-mail. In
short, the patch below seems to fix this particular leak. I'm not sure
that's the correct/complete fix as I seem to still get a 2nd report.
Any info is welcomed.

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..4e33dc2 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work)
}
read_unlock(tasklist_lock);

+   put_pid(tty-session);
+   put_pid(tty-pgrp);
+
tty-flags = 0;
tty-session = NULL;
tty-pgrp = NULL;

On 08/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Catalin Marinas [EMAIL PROTECTED] writes:
 The /sbin/init application calls sys_clone() a few times but only one
 leak is reported (see below). Looking at the reported pid object (at
 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
 exists any more.

[...]

 unreferenced object 0xc7c14500 (size 36):
  comm init, pid 245, jiffies 4294939289
  backtrace:
[c0070c18] kmem_cache_alloc
[c003a528] alloc_pid
[c0026468] do_fork
[c00153b0] sys_clone
[c0010f80] ret_fast_syscall

I think this is the path that all pid structures come from so
unfortunately that doesn't help tracing this problem down.


No, indeed, but that's the only thing kmemleak can report. Anyway, I
got some more information now, after adding several printk's:

The difference from other pid objects is that this one (with nr 296)
is passed as a parameter to proc_set_tty(). The __proc_set_tty()
function increments the pid-count twice via get_pid(), and, with two
other get_pid calls, the pid-count for this object gets to 5 (1 being
the initial value). The prints below are function name, struct pid
address (different from the runs yesterday though), pid-nr and
pid-count (after get_pid incrementing). It also show the return
address and symbol (the calling function):

 alloc_pid: c7c149d8, 296, 1
 get_pid: c7c149d8, 296, 2
   return: c0122d64 (proc_set_tty+0x34/0x54)
 get_pid: c7c149d8, 296, 3
   return: c0122d64 (proc_set_tty+0x34/0x54)
 get_pid: c7c149d8, 296, 4
   return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid
 in disassociate_ctty but it is reported like this because of get_pid
 inlining
 get_pid: c7c149d8, 296, 5
   return: c0124a0c (tty_vhangup+0x14/0x18)

On the exit path (see below), however, put_pid is called twice before
free_pid and once via release_task - detach_pid - free_pid - ... -
__rcu_process_callbacks - delayed_put_pid - put_pid. Note that
free_pid is called with pid-nr == 3 and the last put_pid gets called
with nr == 3 as well (but it decrements it to 2 and that's what I find
at that memory location). In the trace below, the pid-count is
printed before put_pid modifies it:

 put_pid: c7c149d8, 296, 5
   return: c0124b5c (disassociate_ctty+0x14c/0x230)
 put_pid: c7c149d8, 296, 4
   return: c0124ba8 (disassociate_ctty+0x198/0x230)
 detach_pid: c7c149d8, 296, 3
   return: c002a230 (release_task+0x1c0/0x358)
 detach_pid: c7c149d8, 296, 3
   return: c002a248 (release_task+0x1d8/0x358)
 detach_pid: c7c149d8, 296, 3
   return: c002a254 (release_task+0x1e4/0x358)
 free_pid: c7c149d8, 296, 3
   return: c003a990 (detach_pid+0xac/0xc8)
 ...
 delayed_put_pid: c7c149d8, 296, 3
   return: c003af68 (__rcu_process_callbacks+0x19c/0x25c)
 put_pid: c7c149d8, 296, 3
   return: c003a8cc (delayed_put_pid+0x54/0x6c)

In the above disassociate_ctty() function the code below (line 1542)
doesn't seem to get called:

tty = get_current_tty();
if (tty) {
put_pid(tty-session);
put_pid(tty-pgrp);
tty-session = NULL;
tty-pgrp = NULL;
} else {

and I get the following error if TTY_DEBUG_HANGUP is defined - error
attempted to write to tty [0x] = NULL.

It looks like the tty_vhangup() call in in disassociate_ctty() sets
current-signal-tty to NULL in the do_each_pid_task loop in
do_tty_hangup (p-signal-tty = NULL). The second call to
get_current_tty() in disassociate_ctty() return NULL and therefore no
put_pid on tty-session and tty-pgrp (which are also set to NULL in
the previous function).

Regards.

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-09 Thread Catalin Marinas

On 09/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Catalin Marinas [EMAIL PROTECTED] writes:

 On 08/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 Catalin Marinas [EMAIL PROTECTED] writes:

 I think it's only the pid_chain and rcu member that could be placed in
 a list and kmemleak scans the memory for these two offsets as well.
 I'll check those lists anyway but I doubt it's a more fundamental
 problem with how kmemleak handles struct pid as I should've probably
 got more reports.

Right.  I was pointing out the possibilities but because we do
some tricky things.  Mostly I was wondering about the hlist for
the list of tasks.  Now if a task is on that list we should have
a struct pid_link pointing at our struct pid, so it shouldn't fool
kmemleak but I'm still a little curious if all of those hlist_heads are
NULL pointers.


Yes, all the 3 hlist_head tasks are NULL pointers on the reported object.

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-09 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 Eric,

 For a longer explanation, see the second part of this e-mail. In
 short, the patch below seems to fix this particular leak. I'm not sure
 that's the correct/complete fix as I seem to still get a 2nd report.
 Any info is welcomed.

Sure.  I was starting to suspect that location myself.

 diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
 index e453268..4e33dc2 100644
 --- a/drivers/char/tty_io.c
 +++ b/drivers/char/tty_io.c
 @@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work)
   }
   read_unlock(tasklist_lock);

 + put_pid(tty-session);
 + put_pid(tty-pgrp);
 +
   tty-flags = 0;
   tty-session = NULL;
   tty-pgrp = NULL;

 On 08/03/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 Catalin Marinas [EMAIL PROTECTED] writes:
  The /sbin/init application calls sys_clone() a few times but only one
  leak is reported (see below). Looking at the reported pid object (at
  0xc7c14500), count is 2 and nr is 296 but no process with pid 296
  exists any more.
 [...]
  unreferenced object 0xc7c14500 (size 36):
   comm init, pid 245, jiffies 4294939289
   backtrace:
 [c0070c18] kmem_cache_alloc
 [c003a528] alloc_pid
 [c0026468] do_fork
 [c00153b0] sys_clone
 [c0010f80] ret_fast_syscall

 I think this is the path that all pid structures come from so
 unfortunately that doesn't help tracing this problem down.

 No, indeed, but that's the only thing kmemleak can report. Anyway, I
 got some more information now, after adding several printk's:

 The difference from other pid objects is that this one (with nr 296)
 is passed as a parameter to proc_set_tty(). The __proc_set_tty()
 function increments the pid-count twice via get_pid(), and, with two
 other get_pid calls, the pid-count for this object gets to 5 (1 being
 the initial value). The prints below are function name, struct pid
 address (different from the runs yesterday though), pid-nr and
 pid-count (after get_pid incrementing). It also show the return
 address and symbol (the calling function):

  alloc_pid: c7c149d8, 296, 1
  get_pid: c7c149d8, 296, 2
return: c0122d64 (proc_set_tty+0x34/0x54)
  get_pid: c7c149d8, 296, 3
return: c0122d64 (proc_set_tty+0x34/0x54)
  get_pid: c7c149d8, 296, 4
return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid
  in disassociate_ctty but it is reported like this because of get_pid
  inlining
  get_pid: c7c149d8, 296, 5
return: c0124a0c (tty_vhangup+0x14/0x18)

 On the exit path (see below), however, put_pid is called twice before
 free_pid and once via release_task - detach_pid - free_pid - ... -
 __rcu_process_callbacks - delayed_put_pid - put_pid. Note that
 free_pid is called with pid-nr == 3 and the last put_pid gets called
 with nr == 3 as well (but it decrements it to 2 and that's what I find
 at that memory location). In the trace below, the pid-count is
 printed before put_pid modifies it:

  put_pid: c7c149d8, 296, 5
return: c0124b5c (disassociate_ctty+0x14c/0x230)
  put_pid: c7c149d8, 296, 4
return: c0124ba8 (disassociate_ctty+0x198/0x230)
  detach_pid: c7c149d8, 296, 3
return: c002a230 (release_task+0x1c0/0x358)
  detach_pid: c7c149d8, 296, 3
return: c002a248 (release_task+0x1d8/0x358)
  detach_pid: c7c149d8, 296, 3
return: c002a254 (release_task+0x1e4/0x358)
  free_pid: c7c149d8, 296, 3
return: c003a990 (detach_pid+0xac/0xc8)
  ...
  delayed_put_pid: c7c149d8, 296, 3
return: c003af68 (__rcu_process_callbacks+0x19c/0x25c)
  put_pid: c7c149d8, 296, 3
return: c003a8cc (delayed_put_pid+0x54/0x6c)

 In the above disassociate_ctty() function the code below (line 1542)
 doesn't seem to get called:

   tty = get_current_tty();
   if (tty) {
   put_pid(tty-session);
   put_pid(tty-pgrp);
   tty-session = NULL;
   tty-pgrp = NULL;
   } else {

 and I get the following error if TTY_DEBUG_HANGUP is defined - error
 attempted to write to tty [0x] = NULL.

 It looks like the tty_vhangup() call in in disassociate_ctty() sets
 current-signal-tty to NULL in the do_each_pid_task loop in
 do_tty_hangup (p-signal-tty = NULL). The second call to
 get_current_tty() in disassociate_ctty() return NULL and therefore no
 put_pid on tty-session and tty-pgrp (which are also set to NULL in
 the previous function).

Thanks.

If I can manage to focus on this, it looks like the information I need to 
start fixing this.

Adding the reference counting when we didn't have any before is always
interesting.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible "struct pid" leak from tty_io.c

2007-03-08 Thread Eric W. Biederman
"Catalin Marinas" <[EMAIL PROTECTED]> writes:

> Hi Eric,
>
> I'm trying to track down a kmemleak report (on an ARM platform) which
> seems to have appeared with commit
> ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the
> TTY layer at all, is it possible that the above commit missed a
> put_pid() call on some path?

I won't arbitrarily rule a missing put_pid out.  I have been know to
goof up upon occasion.

I just did a quick look to see what kmemleak is.  A conservative
tracing leak detector sounds interesting.  Except for all of the list
heads which lead to container_of calls I don't know of anything in the
struct pid implementation that would be difficult for it to work with.
Well that and there is some rcu access protection which can delay the
free by a bit.

> The /sbin/init application calls sys_clone() a few times but only one
> leak is reported (see below). Looking at the reported pid object (at
> 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
> exists any more.

It could still be a valid session or a process group id.
If you examine the struct pid you can test for this be examining all
of the list heads it keeps.  If there is something on any of the
lists that would account a count of 1.  How we have a count of 2
I don't have enough information to guess.

Core tty layer handling stops having an remembering pids when the
session or process group exits so it is relatively safe from pid wrap
around issues without my changes and should make the kind of thing you
are reporting very unlikely in a correctly functioning system.

In most any other layer we cache pids indefinitely and a situation
where we have a pointer to a struct pid with a ref count of 1 long
after the process goes away is expected.  In this case it is better
to hold a reference to a struct pid so we don't do the wrong thing
when pid wrap around occurs then to hold a reference to an entire
task_struct and lock that in place.

I don't understand your situation enough to guess what is going wrong
yet.  Hopefully I have given you enough information to get started.


> unreferenced object 0xc7c14500 (size 36):
>  comm "init", pid 245, jiffies 4294939289
>  backtrace:
>[] kmem_cache_alloc
>[] alloc_pid
>[] do_fork
>[] sys_clone
>[] ret_fast_syscall

I think this is the path that all pid structures come from so
unfortunately that doesn't help tracing this problem down.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Possible "struct pid" leak from tty_io.c

2007-03-08 Thread Catalin Marinas

Hi Eric,

I'm trying to track down a kmemleak report (on an ARM platform) which
seems to have appeared with commit
ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the
TTY layer at all, is it possible that the above commit missed a
put_pid() call on some path?

The /sbin/init application calls sys_clone() a few times but only one
leak is reported (see below). Looking at the reported pid object (at
0xc7c14500), count is 2 and nr is 296 but no process with pid 296
exists any more.

unreferenced object 0xc7c14500 (size 36):
 comm "init", pid 245, jiffies 4294939289
 backtrace:
   [] kmem_cache_alloc
   [] alloc_pid
   [] do_fork
   [] sys_clone
   [] ret_fast_syscall

Thanks.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Possible struct pid leak from tty_io.c

2007-03-08 Thread Catalin Marinas

Hi Eric,

I'm trying to track down a kmemleak report (on an ARM platform) which
seems to have appeared with commit
ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the
TTY layer at all, is it possible that the above commit missed a
put_pid() call on some path?

The /sbin/init application calls sys_clone() a few times but only one
leak is reported (see below). Looking at the reported pid object (at
0xc7c14500), count is 2 and nr is 296 but no process with pid 296
exists any more.

unreferenced object 0xc7c14500 (size 36):
 comm init, pid 245, jiffies 4294939289
 backtrace:
   [c0070c18] kmem_cache_alloc
   [c003a528] alloc_pid
   [c0026468] do_fork
   [c00153b0] sys_clone
   [c0010f80] ret_fast_syscall

Thanks.

--
Catalin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Possible struct pid leak from tty_io.c

2007-03-08 Thread Eric W. Biederman
Catalin Marinas [EMAIL PROTECTED] writes:

 Hi Eric,

 I'm trying to track down a kmemleak report (on an ARM platform) which
 seems to have appeared with commit
 ab521dc0f8e117fd808d3e425216864d60390500. As I'm not familiar with the
 TTY layer at all, is it possible that the above commit missed a
 put_pid() call on some path?

I won't arbitrarily rule a missing put_pid out.  I have been know to
goof up upon occasion.

I just did a quick look to see what kmemleak is.  A conservative
tracing leak detector sounds interesting.  Except for all of the list
heads which lead to container_of calls I don't know of anything in the
struct pid implementation that would be difficult for it to work with.
Well that and there is some rcu access protection which can delay the
free by a bit.

 The /sbin/init application calls sys_clone() a few times but only one
 leak is reported (see below). Looking at the reported pid object (at
 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
 exists any more.

It could still be a valid session or a process group id.
If you examine the struct pid you can test for this be examining all
of the list heads it keeps.  If there is something on any of the
lists that would account a count of 1.  How we have a count of 2
I don't have enough information to guess.

Core tty layer handling stops having an remembering pids when the
session or process group exits so it is relatively safe from pid wrap
around issues without my changes and should make the kind of thing you
are reporting very unlikely in a correctly functioning system.

In most any other layer we cache pids indefinitely and a situation
where we have a pointer to a struct pid with a ref count of 1 long
after the process goes away is expected.  In this case it is better
to hold a reference to a struct pid so we don't do the wrong thing
when pid wrap around occurs then to hold a reference to an entire
task_struct and lock that in place.

I don't understand your situation enough to guess what is going wrong
yet.  Hopefully I have given you enough information to get started.


 unreferenced object 0xc7c14500 (size 36):
  comm init, pid 245, jiffies 4294939289
  backtrace:
[c0070c18] kmem_cache_alloc
[c003a528] alloc_pid
[c0026468] do_fork
[c00153b0] sys_clone
[c0010f80] ret_fast_syscall

I think this is the path that all pid structures come from so
unfortunately that doesn't help tracing this problem down.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/