Re: [RFC PATCH next]module: Fix mod->mkobj.kobj potentially freed too early

2013-08-22 Thread Li Zhong
On Thu, 2013-08-22 at 16:30 +0930, Rusty Russell wrote:
> Greg KH  writes:
> > On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
> > >  struct kobj_type module_ktype = {
> > > + .release   =module_kobj_release,
> > >   .sysfs_ops =_sysfs_ops,
> > >  };
> >
> > Wait, as there is no release function here for the kobject (a different
> > problem), why is the deferred release function causing any problems?
> > There is no release function to call, so what is causing the oops?
> 
> Because DEBUG_KOBJECT_RELEASE does the kobject_put() sometime later,
> which is what causes the oops.
> 
> Since kobjects don't have an owner field, AFAICT someone *could* grab
> one in a module which is unloading, then put it after unload.   So this
> fixes a real bug, albeit not one seen in the real world.
> 
> Applied,

Oh, thank you, Rusty. 

I just sent out another version... which fix it in another way as Greg
suggested, could you please also help to take a look at it?

Thanks, Zhong

> Rusty.
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod->mkobj.kobj potentially freed too early

2013-08-22 Thread Rusty Russell
Greg KH  writes:
> On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
> >  struct kobj_type module_ktype = {
> > +   .release   =module_kobj_release,
> > .sysfs_ops =_sysfs_ops,
> >  };
>
> Wait, as there is no release function here for the kobject (a different
> problem), why is the deferred release function causing any problems?
> There is no release function to call, so what is causing the oops?

Because DEBUG_KOBJECT_RELEASE does the kobject_put() sometime later,
which is what causes the oops.

Since kobjects don't have an owner field, AFAICT someone *could* grab
one in a module which is unloading, then put it after unload.   So this
fixes a real bug, albeit not one seen in the real world.

Applied,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod-mkobj.kobj potentially freed too early

2013-08-22 Thread Rusty Russell
Greg KH gre...@linuxfoundation.org writes:
 On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
   struct kobj_type module_ktype = {
  +   .release   =module_kobj_release,
  .sysfs_ops =module_sysfs_ops,
   };

 Wait, as there is no release function here for the kobject (a different
 problem), why is the deferred release function causing any problems?
 There is no release function to call, so what is causing the oops?

Because DEBUG_KOBJECT_RELEASE does the kobject_put() sometime later,
which is what causes the oops.

Since kobjects don't have an owner field, AFAICT someone *could* grab
one in a module which is unloading, then put it after unload.   So this
fixes a real bug, albeit not one seen in the real world.

Applied,
Rusty.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod-mkobj.kobj potentially freed too early

2013-08-22 Thread Li Zhong
On Thu, 2013-08-22 at 16:30 +0930, Rusty Russell wrote:
 Greg KH gre...@linuxfoundation.org writes:
  On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
struct kobj_type module_ktype = {
   + .release   =module_kobj_release,
 .sysfs_ops =module_sysfs_ops,
};
 
  Wait, as there is no release function here for the kobject (a different
  problem), why is the deferred release function causing any problems?
  There is no release function to call, so what is causing the oops?
 
 Because DEBUG_KOBJECT_RELEASE does the kobject_put() sometime later,
 which is what causes the oops.
 
 Since kobjects don't have an owner field, AFAICT someone *could* grab
 one in a module which is unloading, then put it after unload.   So this
 fixes a real bug, albeit not one seen in the real world.
 
 Applied,

Oh, thank you, Rusty. 

I just sent out another version... which fix it in another way as Greg
suggested, could you please also help to take a look at it?

Thanks, Zhong

 Rusty.
 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod->mkobj.kobj potentially freed too early

2013-08-21 Thread Li Zhong
On Wed, 2013-08-21 at 21:03 -0700, Greg KH wrote:
> On Thu, Aug 22, 2013 at 10:34:06AM +0800, Li Zhong wrote:
> > On Wed, 2013-08-21 at 09:18 -0700, Greg KH wrote:
> > > On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
> > > > DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
> > > > 
> > > > After some investigation, it seems the reason is:
> > > > The mod->mkobj.kobj(a01600d0 below) is freed together with mod
> > > > itself in free_module(). However, its children still hold references to
> > > > it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
> > > > child(holders below) tries to decrease the reference count to its parent
> > > > in kobject_del(), BUG happens as it tries to access already freed 
> > > > memory.
> > > 
> > > Ah, thanks for tracking this down.  I had seen this in my local testing,
> > > but wasn't able to figure out the offending code.
> > > 
> > > > This patch tries to fix it by waiting for the mod->mkobj.kobj to be
> > > > really released in the module removing process (and some error code
> > > > paths).
> > > 
> > > Nasty, we should just be freeing the structure in the release function,
> > > why doesn't that work?
> > 
> > Hi Greg,
> > 
> > It seems I didn't describe it clearly in the previous mail. I'm trying
> > to do it better below: 
> > 
> > mod->mkobj.kobj is embedded in module_kobject(not a pointer):
> > struct module_kobject {
> > struct kobject kobj;
> > ...
> > 
> > and allocated with the module memory. So we could see the parent below
> > a01600d0 is between MODULES_VADDR (a000) and
> > MODULES_END(ff00). 
> > 
> > It seem to me that the mkobj.kobj is freed by module_free(mod,
> > mod->module_core). 
> 
> Ick, you are right.  If a kobject is being embedded in an object, it
> should control the lifespan of the object, not somewhere else like is
> happening here.
> 
> The best solution for this is to make the kobject a pointer, not
> embedded in the structure, that will fix this issue, right?

Yes, I think so. I'll try to write a fix using this way, thanks for your
suggestion.

Thanks, Zhong

> 
> thanks,
> 
> greg k-h
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod->mkobj.kobj potentially freed too early

2013-08-21 Thread Greg KH
On Thu, Aug 22, 2013 at 10:34:06AM +0800, Li Zhong wrote:
> On Wed, 2013-08-21 at 09:18 -0700, Greg KH wrote:
> > On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
> > > DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
> > > 
> > > After some investigation, it seems the reason is:
> > > The mod->mkobj.kobj(a01600d0 below) is freed together with mod
> > > itself in free_module(). However, its children still hold references to
> > > it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
> > > child(holders below) tries to decrease the reference count to its parent
> > > in kobject_del(), BUG happens as it tries to access already freed memory.
> > 
> > Ah, thanks for tracking this down.  I had seen this in my local testing,
> > but wasn't able to figure out the offending code.
> > 
> > > This patch tries to fix it by waiting for the mod->mkobj.kobj to be
> > > really released in the module removing process (and some error code
> > > paths).
> > 
> > Nasty, we should just be freeing the structure in the release function,
> > why doesn't that work?
> 
> Hi Greg,
> 
> It seems I didn't describe it clearly in the previous mail. I'm trying
> to do it better below: 
> 
> mod->mkobj.kobj is embedded in module_kobject(not a pointer):
> struct module_kobject {
> struct kobject kobj;
>   ...
> 
> and allocated with the module memory. So we could see the parent below
> a01600d0 is between MODULES_VADDR (a000) and
> MODULES_END(ff00). 
> 
> It seem to me that the mkobj.kobj is freed by module_free(mod,
> mod->module_core). 

Ick, you are right.  If a kobject is being embedded in an object, it
should control the lifespan of the object, not somewhere else like is
happening here.

The best solution for this is to make the kobject a pointer, not
embedded in the structure, that will fix this issue, right?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod->mkobj.kobj potentially freed too early

2013-08-21 Thread Li Zhong
On Wed, 2013-08-21 at 09:18 -0700, Greg KH wrote:
> On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
> > DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
> > 
> > After some investigation, it seems the reason is:
> > The mod->mkobj.kobj(a01600d0 below) is freed together with mod
> > itself in free_module(). However, its children still hold references to
> > it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
> > child(holders below) tries to decrease the reference count to its parent
> > in kobject_del(), BUG happens as it tries to access already freed memory.
> 
> Ah, thanks for tracking this down.  I had seen this in my local testing,
> but wasn't able to figure out the offending code.
> 
> > This patch tries to fix it by waiting for the mod->mkobj.kobj to be
> > really released in the module removing process (and some error code
> > paths).
> 
> Nasty, we should just be freeing the structure in the release function,
> why doesn't that work?

Hi Greg,

It seems I didn't describe it clearly in the previous mail. I'm trying
to do it better below: 

mod->mkobj.kobj is embedded in module_kobject(not a pointer):
struct module_kobject {
struct kobject kobj;
...

and allocated with the module memory. So we could see the parent below
a01600d0 is between MODULES_VADDR (a000) and
MODULES_END(ff00). 

It seem to me that the mkobj.kobj is freed by module_free(mod,
mod->module_core). 

So in free_module(), we need delay the call of module_free(), until
mkobj.kobj finishes its cleanup. 

> > [ 1844.175287] kobject: 'holders' (88007c1f1600): kobject_release, 
> > parent a01600d0 (delayed)
> > [ 1844.178991] kobject: 'notes' (8800370b2a00): kobject_release, parent 
> > a01600d0 (delayed)
> > [ 1845.180118] kobject: 'holders' (88007c1f1600): kobject_cleanup, 
> > parent a01600d0
> > [ 1845.182130] kobject: 'holders' (88007c1f1600): auto cleanup 
> > kobject_del
> > [ 1845.184120] BUG: unable to handle kernel paging request at 
> > a01601d0
> > [ 1845.185026] IP: [] kobject_put+0x11/0x60
> > [ 1845.185026] PGD 1a13067 PUD 1a14063 PMD 7bd30067 PTE 0
> > [ 1845.185026] Oops:  [#1] PREEMPT 
> > [ 1845.185026] Modules linked in: xfs libcrc32c [last unloaded: 
> > kprobe_example]
> > [ 1845.185026] CPU: 0 PID: 18 Comm: kworker/0:1 Tainted: G   O 
> > 3.11.0-rc6-next-20130819+ #1
> > [ 1845.185026] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> > [ 1845.185026] Workqueue: events kobject_delayed_cleanup
> > [ 1845.185026] task: 88007ca51f00 ti: 88007ca5c000 task.ti: 
> > 88007ca5c000
> > [ 1845.185026] RIP: 0010:[]  [] 
> > kobject_put+0x11/0x60
> > [ 1845.185026] RSP: 0018:88007ca5dd08  EFLAGS: 00010282
> > [ 1845.185026] RAX: 2000 RBX: a01600d0 RCX: 
> > 8177d638
> > [ 1845.185026] RDX: 88007ca5dc18 RSI:  RDI: 
> > a01600d0
> > [ 1845.185026] RBP: 88007ca5dd18 R08: 824e9810 R09: 
> > 
> > [ 1845.185026] R10: 8800 R11: dead4ead0001 R12: 
> > 81a95040
> > [ 1845.185026] R13: 88007b27a960 R14: 88007c1f1600 R15: 
> > 
> > [ 1845.185026] FS:  () GS:81a23000() 
> > knlGS:
> > [ 1845.185026] CS:  0010 DS:  ES:  CR0: 8005003b
> > [ 1845.185026] CR2: a01601d0 CR3: 37207000 CR4: 
> > 06b0
> > [ 1845.185026] Stack:
> > [ 1845.185026]  88007c1f1600 88007c1f1600 88007ca5dd38 
> > 812cdb7e
> > [ 1845.185026]   88007c1f1640 88007ca5dd68 
> > 812cdbfe
> > [ 1845.185026]  88007c974800 88007c1f1640 88007ff61a00 
> > 
> > [ 1845.185026] Call Trace:
> > [ 1845.185026]  [] kobject_del+0x2e/0x40
> > [ 1845.185026]  [] kobject_delayed_cleanup+0x6e/0x1d0
> > [ 1845.185026]  [] process_one_work+0x1e5/0x670
> > [ 1845.185026]  [] ? process_one_work+0x183/0x670
> > [ 1845.185026]  [] worker_thread+0x113/0x370
> > [ 1845.185026]  [] ? rescuer_thread+0x290/0x290
> > [ 1845.185026]  [] kthread+0xda/0xe0
> > [ 1845.185026]  [] ? _raw_spin_unlock_irq+0x30/0x60
> > [ 1845.185026]  [] ? kthread_create_on_node+0x130/0x130
> > [ 1845.185026]  [] ret_from_fork+0x7a/0xb0
> > [ 1845.185026]  [] ? kthread_create_on_node+0x130/0x130
> > [ 1845.185026] Code: 81 48 c7 c7 28 95 ad 81 31 c0 e8 9b da 01 00 e9 4f ff 
> > ff ff 66 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 1d 
> >  87 00 01 00 00 01 74 1e 48 8d 7b 38 83 6b 38 01 0f 94 c0 84 
> > [ 1845.185026] RIP  [] kobject_put+0x11/0x60
> > [ 1845.185026]  RSP 
> > [ 1845.185026] CR2: a01601d0
> > [ 1845.185026] ---[ end trace 49a70afd109f5653 ]---
> > 
> > Signed-off-by: Li Zhong 
> > ---
> >  include/linux/module.h |  1 +
> >  kernel/module.c| 14 +++---
> >  kernel/params.c|  7 +++
> >  3 files 

Re: [RFC PATCH next]module: Fix mod->mkobj.kobj potentially freed too early

2013-08-21 Thread Greg KH
On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
> DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
> 
> After some investigation, it seems the reason is:
> The mod->mkobj.kobj(a01600d0 below) is freed together with mod
> itself in free_module(). However, its children still hold references to
> it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
> child(holders below) tries to decrease the reference count to its parent
> in kobject_del(), BUG happens as it tries to access already freed memory.

Ah, thanks for tracking this down.  I had seen this in my local testing,
but wasn't able to figure out the offending code.

> This patch tries to fix it by waiting for the mod->mkobj.kobj to be
> really released in the module removing process (and some error code
> paths).

Nasty, we should just be freeing the structure in the release function,
why doesn't that work?

> [ 1844.175287] kobject: 'holders' (88007c1f1600): kobject_release, parent 
> a01600d0 (delayed)
> [ 1844.178991] kobject: 'notes' (8800370b2a00): kobject_release, parent 
> a01600d0 (delayed)
> [ 1845.180118] kobject: 'holders' (88007c1f1600): kobject_cleanup, parent 
> a01600d0
> [ 1845.182130] kobject: 'holders' (88007c1f1600): auto cleanup kobject_del
> [ 1845.184120] BUG: unable to handle kernel paging request at a01601d0
> [ 1845.185026] IP: [] kobject_put+0x11/0x60
> [ 1845.185026] PGD 1a13067 PUD 1a14063 PMD 7bd30067 PTE 0
> [ 1845.185026] Oops:  [#1] PREEMPT 
> [ 1845.185026] Modules linked in: xfs libcrc32c [last unloaded: 
> kprobe_example]
> [ 1845.185026] CPU: 0 PID: 18 Comm: kworker/0:1 Tainted: G   O 
> 3.11.0-rc6-next-20130819+ #1
> [ 1845.185026] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [ 1845.185026] Workqueue: events kobject_delayed_cleanup
> [ 1845.185026] task: 88007ca51f00 ti: 88007ca5c000 task.ti: 
> 88007ca5c000
> [ 1845.185026] RIP: 0010:[]  [] 
> kobject_put+0x11/0x60
> [ 1845.185026] RSP: 0018:88007ca5dd08  EFLAGS: 00010282
> [ 1845.185026] RAX: 2000 RBX: a01600d0 RCX: 
> 8177d638
> [ 1845.185026] RDX: 88007ca5dc18 RSI:  RDI: 
> a01600d0
> [ 1845.185026] RBP: 88007ca5dd18 R08: 824e9810 R09: 
> 
> [ 1845.185026] R10: 8800 R11: dead4ead0001 R12: 
> 81a95040
> [ 1845.185026] R13: 88007b27a960 R14: 88007c1f1600 R15: 
> 
> [ 1845.185026] FS:  () GS:81a23000() 
> knlGS:
> [ 1845.185026] CS:  0010 DS:  ES:  CR0: 8005003b
> [ 1845.185026] CR2: a01601d0 CR3: 37207000 CR4: 
> 06b0
> [ 1845.185026] Stack:
> [ 1845.185026]  88007c1f1600 88007c1f1600 88007ca5dd38 
> 812cdb7e
> [ 1845.185026]   88007c1f1640 88007ca5dd68 
> 812cdbfe
> [ 1845.185026]  88007c974800 88007c1f1640 88007ff61a00 
> 
> [ 1845.185026] Call Trace:
> [ 1845.185026]  [] kobject_del+0x2e/0x40
> [ 1845.185026]  [] kobject_delayed_cleanup+0x6e/0x1d0
> [ 1845.185026]  [] process_one_work+0x1e5/0x670
> [ 1845.185026]  [] ? process_one_work+0x183/0x670
> [ 1845.185026]  [] worker_thread+0x113/0x370
> [ 1845.185026]  [] ? rescuer_thread+0x290/0x290
> [ 1845.185026]  [] kthread+0xda/0xe0
> [ 1845.185026]  [] ? _raw_spin_unlock_irq+0x30/0x60
> [ 1845.185026]  [] ? kthread_create_on_node+0x130/0x130
> [ 1845.185026]  [] ret_from_fork+0x7a/0xb0
> [ 1845.185026]  [] ? kthread_create_on_node+0x130/0x130
> [ 1845.185026] Code: 81 48 c7 c7 28 95 ad 81 31 c0 e8 9b da 01 00 e9 4f ff ff 
> ff 66 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 1d  
> 87 00 01 00 00 01 74 1e 48 8d 7b 38 83 6b 38 01 0f 94 c0 84 
> [ 1845.185026] RIP  [] kobject_put+0x11/0x60
> [ 1845.185026]  RSP 
> [ 1845.185026] CR2: a01601d0
> [ 1845.185026] ---[ end trace 49a70afd109f5653 ]---
> 
> Signed-off-by: Li Zhong 
> ---
>  include/linux/module.h |  1 +
>  kernel/module.c| 14 +++---
>  kernel/params.c|  7 +++
>  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 504035f..05f2447 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -42,6 +42,7 @@ struct module_kobject {
>   struct module *mod;
>   struct kobject *drivers_dir;
>   struct module_param_attrs *mp;
> + struct completion *kobj_completion;
>  };
>  
>  struct module_attribute {
> diff --git a/kernel/module.c b/kernel/module.c
> index 2069158..b5e2baa 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -1611,6 +1611,14 @@ static void module_remove_modinfo_attrs(struct module 
> *mod)
>   kfree(mod->modinfo_attrs);
>  }
>  
> +static void mod_kobject_put(struct module *mod)
> +{
> + DECLARE_COMPLETION_ONSTACK(c);
> + mod->mkobj.kobj_completion = 
> +   

Re: [RFC PATCH next]module: Fix mod-mkobj.kobj potentially freed too early

2013-08-21 Thread Greg KH
On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
 DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
 
 After some investigation, it seems the reason is:
 The mod-mkobj.kobj(a01600d0 below) is freed together with mod
 itself in free_module(). However, its children still hold references to
 it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
 child(holders below) tries to decrease the reference count to its parent
 in kobject_del(), BUG happens as it tries to access already freed memory.

Ah, thanks for tracking this down.  I had seen this in my local testing,
but wasn't able to figure out the offending code.

 This patch tries to fix it by waiting for the mod-mkobj.kobj to be
 really released in the module removing process (and some error code
 paths).

Nasty, we should just be freeing the structure in the release function,
why doesn't that work?

 [ 1844.175287] kobject: 'holders' (88007c1f1600): kobject_release, parent 
 a01600d0 (delayed)
 [ 1844.178991] kobject: 'notes' (8800370b2a00): kobject_release, parent 
 a01600d0 (delayed)
 [ 1845.180118] kobject: 'holders' (88007c1f1600): kobject_cleanup, parent 
 a01600d0
 [ 1845.182130] kobject: 'holders' (88007c1f1600): auto cleanup kobject_del
 [ 1845.184120] BUG: unable to handle kernel paging request at a01601d0
 [ 1845.185026] IP: [812cda81] kobject_put+0x11/0x60
 [ 1845.185026] PGD 1a13067 PUD 1a14063 PMD 7bd30067 PTE 0
 [ 1845.185026] Oops:  [#1] PREEMPT 
 [ 1845.185026] Modules linked in: xfs libcrc32c [last unloaded: 
 kprobe_example]
 [ 1845.185026] CPU: 0 PID: 18 Comm: kworker/0:1 Tainted: G   O 
 3.11.0-rc6-next-20130819+ #1
 [ 1845.185026] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 [ 1845.185026] Workqueue: events kobject_delayed_cleanup
 [ 1845.185026] task: 88007ca51f00 ti: 88007ca5c000 task.ti: 
 88007ca5c000
 [ 1845.185026] RIP: 0010:[812cda81]  [812cda81] 
 kobject_put+0x11/0x60
 [ 1845.185026] RSP: 0018:88007ca5dd08  EFLAGS: 00010282
 [ 1845.185026] RAX: 2000 RBX: a01600d0 RCX: 
 8177d638
 [ 1845.185026] RDX: 88007ca5dc18 RSI:  RDI: 
 a01600d0
 [ 1845.185026] RBP: 88007ca5dd18 R08: 824e9810 R09: 
 
 [ 1845.185026] R10: 8800 R11: dead4ead0001 R12: 
 81a95040
 [ 1845.185026] R13: 88007b27a960 R14: 88007c1f1600 R15: 
 
 [ 1845.185026] FS:  () GS:81a23000() 
 knlGS:
 [ 1845.185026] CS:  0010 DS:  ES:  CR0: 8005003b
 [ 1845.185026] CR2: a01601d0 CR3: 37207000 CR4: 
 06b0
 [ 1845.185026] Stack:
 [ 1845.185026]  88007c1f1600 88007c1f1600 88007ca5dd38 
 812cdb7e
 [ 1845.185026]   88007c1f1640 88007ca5dd68 
 812cdbfe
 [ 1845.185026]  88007c974800 88007c1f1640 88007ff61a00 
 
 [ 1845.185026] Call Trace:
 [ 1845.185026]  [812cdb7e] kobject_del+0x2e/0x40
 [ 1845.185026]  [812cdbfe] kobject_delayed_cleanup+0x6e/0x1d0
 [ 1845.185026]  [81063a45] process_one_work+0x1e5/0x670
 [ 1845.185026]  [810639e3] ? process_one_work+0x183/0x670
 [ 1845.185026]  [810642b3] worker_thread+0x113/0x370
 [ 1845.185026]  [810641a0] ? rescuer_thread+0x290/0x290
 [ 1845.185026]  [8106bfba] kthread+0xda/0xe0
 [ 1845.185026]  [814ff0f0] ? _raw_spin_unlock_irq+0x30/0x60
 [ 1845.185026]  [8106bee0] ? kthread_create_on_node+0x130/0x130
 [ 1845.185026]  [8150751a] ret_from_fork+0x7a/0xb0
 [ 1845.185026]  [8106bee0] ? kthread_create_on_node+0x130/0x130
 [ 1845.185026] Code: 81 48 c7 c7 28 95 ad 81 31 c0 e8 9b da 01 00 e9 4f ff ff 
 ff 66 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 1d f6 
 87 00 01 00 00 01 74 1e 48 8d 7b 38 83 6b 38 01 0f 94 c0 84 
 [ 1845.185026] RIP  [812cda81] kobject_put+0x11/0x60
 [ 1845.185026]  RSP 88007ca5dd08
 [ 1845.185026] CR2: a01601d0
 [ 1845.185026] ---[ end trace 49a70afd109f5653 ]---
 
 Signed-off-by: Li Zhong zh...@linux.vnet.ibm.com
 ---
  include/linux/module.h |  1 +
  kernel/module.c| 14 +++---
  kernel/params.c|  7 +++
  3 files changed, 19 insertions(+), 3 deletions(-)
 
 diff --git a/include/linux/module.h b/include/linux/module.h
 index 504035f..05f2447 100644
 --- a/include/linux/module.h
 +++ b/include/linux/module.h
 @@ -42,6 +42,7 @@ struct module_kobject {
   struct module *mod;
   struct kobject *drivers_dir;
   struct module_param_attrs *mp;
 + struct completion *kobj_completion;
  };
  
  struct module_attribute {
 diff --git a/kernel/module.c b/kernel/module.c
 index 2069158..b5e2baa 100644
 --- a/kernel/module.c
 +++ b/kernel/module.c
 @@ -1611,6 +1611,14 @@ static void module_remove_modinfo_attrs(struct module 
 *mod)
   

Re: [RFC PATCH next]module: Fix mod-mkobj.kobj potentially freed too early

2013-08-21 Thread Li Zhong
On Wed, 2013-08-21 at 09:18 -0700, Greg KH wrote:
 On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
  DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
  
  After some investigation, it seems the reason is:
  The mod-mkobj.kobj(a01600d0 below) is freed together with mod
  itself in free_module(). However, its children still hold references to
  it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
  child(holders below) tries to decrease the reference count to its parent
  in kobject_del(), BUG happens as it tries to access already freed memory.
 
 Ah, thanks for tracking this down.  I had seen this in my local testing,
 but wasn't able to figure out the offending code.
 
  This patch tries to fix it by waiting for the mod-mkobj.kobj to be
  really released in the module removing process (and some error code
  paths).
 
 Nasty, we should just be freeing the structure in the release function,
 why doesn't that work?

Hi Greg,

It seems I didn't describe it clearly in the previous mail. I'm trying
to do it better below: 

mod-mkobj.kobj is embedded in module_kobject(not a pointer):
struct module_kobject {
struct kobject kobj;
...

and allocated with the module memory. So we could see the parent below
a01600d0 is between MODULES_VADDR (a000) and
MODULES_END(ff00). 

It seem to me that the mkobj.kobj is freed by module_free(mod,
mod-module_core). 

So in free_module(), we need delay the call of module_free(), until
mkobj.kobj finishes its cleanup. 

  [ 1844.175287] kobject: 'holders' (88007c1f1600): kobject_release, 
  parent a01600d0 (delayed)
  [ 1844.178991] kobject: 'notes' (8800370b2a00): kobject_release, parent 
  a01600d0 (delayed)
  [ 1845.180118] kobject: 'holders' (88007c1f1600): kobject_cleanup, 
  parent a01600d0
  [ 1845.182130] kobject: 'holders' (88007c1f1600): auto cleanup 
  kobject_del
  [ 1845.184120] BUG: unable to handle kernel paging request at 
  a01601d0
  [ 1845.185026] IP: [812cda81] kobject_put+0x11/0x60
  [ 1845.185026] PGD 1a13067 PUD 1a14063 PMD 7bd30067 PTE 0
  [ 1845.185026] Oops:  [#1] PREEMPT 
  [ 1845.185026] Modules linked in: xfs libcrc32c [last unloaded: 
  kprobe_example]
  [ 1845.185026] CPU: 0 PID: 18 Comm: kworker/0:1 Tainted: G   O 
  3.11.0-rc6-next-20130819+ #1
  [ 1845.185026] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
  [ 1845.185026] Workqueue: events kobject_delayed_cleanup
  [ 1845.185026] task: 88007ca51f00 ti: 88007ca5c000 task.ti: 
  88007ca5c000
  [ 1845.185026] RIP: 0010:[812cda81]  [812cda81] 
  kobject_put+0x11/0x60
  [ 1845.185026] RSP: 0018:88007ca5dd08  EFLAGS: 00010282
  [ 1845.185026] RAX: 2000 RBX: a01600d0 RCX: 
  8177d638
  [ 1845.185026] RDX: 88007ca5dc18 RSI:  RDI: 
  a01600d0
  [ 1845.185026] RBP: 88007ca5dd18 R08: 824e9810 R09: 
  
  [ 1845.185026] R10: 8800 R11: dead4ead0001 R12: 
  81a95040
  [ 1845.185026] R13: 88007b27a960 R14: 88007c1f1600 R15: 
  
  [ 1845.185026] FS:  () GS:81a23000() 
  knlGS:
  [ 1845.185026] CS:  0010 DS:  ES:  CR0: 8005003b
  [ 1845.185026] CR2: a01601d0 CR3: 37207000 CR4: 
  06b0
  [ 1845.185026] Stack:
  [ 1845.185026]  88007c1f1600 88007c1f1600 88007ca5dd38 
  812cdb7e
  [ 1845.185026]   88007c1f1640 88007ca5dd68 
  812cdbfe
  [ 1845.185026]  88007c974800 88007c1f1640 88007ff61a00 
  
  [ 1845.185026] Call Trace:
  [ 1845.185026]  [812cdb7e] kobject_del+0x2e/0x40
  [ 1845.185026]  [812cdbfe] kobject_delayed_cleanup+0x6e/0x1d0
  [ 1845.185026]  [81063a45] process_one_work+0x1e5/0x670
  [ 1845.185026]  [810639e3] ? process_one_work+0x183/0x670
  [ 1845.185026]  [810642b3] worker_thread+0x113/0x370
  [ 1845.185026]  [810641a0] ? rescuer_thread+0x290/0x290
  [ 1845.185026]  [8106bfba] kthread+0xda/0xe0
  [ 1845.185026]  [814ff0f0] ? _raw_spin_unlock_irq+0x30/0x60
  [ 1845.185026]  [8106bee0] ? kthread_create_on_node+0x130/0x130
  [ 1845.185026]  [8150751a] ret_from_fork+0x7a/0xb0
  [ 1845.185026]  [8106bee0] ? kthread_create_on_node+0x130/0x130
  [ 1845.185026] Code: 81 48 c7 c7 28 95 ad 81 31 c0 e8 9b da 01 00 e9 4f ff 
  ff ff 66 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 1d 
  f6 87 00 01 00 00 01 74 1e 48 8d 7b 38 83 6b 38 01 0f 94 c0 84 
  [ 1845.185026] RIP  [812cda81] kobject_put+0x11/0x60
  [ 1845.185026]  RSP 88007ca5dd08
  [ 1845.185026] CR2: a01601d0
  [ 1845.185026] ---[ end trace 49a70afd109f5653 ]---
  
  Signed-off-by: Li Zhong zh...@linux.vnet.ibm.com
  ---
   include/linux/module.h |  

Re: [RFC PATCH next]module: Fix mod-mkobj.kobj potentially freed too early

2013-08-21 Thread Greg KH
On Thu, Aug 22, 2013 at 10:34:06AM +0800, Li Zhong wrote:
 On Wed, 2013-08-21 at 09:18 -0700, Greg KH wrote:
  On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
   DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
   
   After some investigation, it seems the reason is:
   The mod-mkobj.kobj(a01600d0 below) is freed together with mod
   itself in free_module(). However, its children still hold references to
   it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
   child(holders below) tries to decrease the reference count to its parent
   in kobject_del(), BUG happens as it tries to access already freed memory.
  
  Ah, thanks for tracking this down.  I had seen this in my local testing,
  but wasn't able to figure out the offending code.
  
   This patch tries to fix it by waiting for the mod-mkobj.kobj to be
   really released in the module removing process (and some error code
   paths).
  
  Nasty, we should just be freeing the structure in the release function,
  why doesn't that work?
 
 Hi Greg,
 
 It seems I didn't describe it clearly in the previous mail. I'm trying
 to do it better below: 
 
 mod-mkobj.kobj is embedded in module_kobject(not a pointer):
 struct module_kobject {
 struct kobject kobj;
   ...
 
 and allocated with the module memory. So we could see the parent below
 a01600d0 is between MODULES_VADDR (a000) and
 MODULES_END(ff00). 
 
 It seem to me that the mkobj.kobj is freed by module_free(mod,
 mod-module_core). 

Ick, you are right.  If a kobject is being embedded in an object, it
should control the lifespan of the object, not somewhere else like is
happening here.

The best solution for this is to make the kobject a pointer, not
embedded in the structure, that will fix this issue, right?

thanks,

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH next]module: Fix mod-mkobj.kobj potentially freed too early

2013-08-21 Thread Li Zhong
On Wed, 2013-08-21 at 21:03 -0700, Greg KH wrote:
 On Thu, Aug 22, 2013 at 10:34:06AM +0800, Li Zhong wrote:
  On Wed, 2013-08-21 at 09:18 -0700, Greg KH wrote:
   On Wed, Aug 21, 2013 at 05:49:58PM +0800, Li Zhong wrote:
DEBUG_KOBJECT_RELEASE helps to find the issue attached below.

After some investigation, it seems the reason is:
The mod-mkobj.kobj(a01600d0 below) is freed together with mod
itself in free_module(). However, its children still hold references to
it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
child(holders below) tries to decrease the reference count to its parent
in kobject_del(), BUG happens as it tries to access already freed 
memory.
   
   Ah, thanks for tracking this down.  I had seen this in my local testing,
   but wasn't able to figure out the offending code.
   
This patch tries to fix it by waiting for the mod-mkobj.kobj to be
really released in the module removing process (and some error code
paths).
   
   Nasty, we should just be freeing the structure in the release function,
   why doesn't that work?
  
  Hi Greg,
  
  It seems I didn't describe it clearly in the previous mail. I'm trying
  to do it better below: 
  
  mod-mkobj.kobj is embedded in module_kobject(not a pointer):
  struct module_kobject {
  struct kobject kobj;
  ...
  
  and allocated with the module memory. So we could see the parent below
  a01600d0 is between MODULES_VADDR (a000) and
  MODULES_END(ff00). 
  
  It seem to me that the mkobj.kobj is freed by module_free(mod,
  mod-module_core). 
 
 Ick, you are right.  If a kobject is being embedded in an object, it
 should control the lifespan of the object, not somewhere else like is
 happening here.
 
 The best solution for this is to make the kobject a pointer, not
 embedded in the structure, that will fix this issue, right?

Yes, I think so. I'll try to write a fix using this way, thanks for your
suggestion.

Thanks, Zhong

 
 thanks,
 
 greg k-h
 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/