Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
>>> On Tue, Feb 5, 2008 at 4:58 PM, in message <[EMAIL PROTECTED]>, Daniel Walker <[EMAIL PROTECTED]> wrote:
> On Tue, Feb 05, 2008 at 11:25:18AM -0700, Gregory Haskins wrote:
>> @@ -6241,7 +6242,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
>>  		cpu_clear(rq->cpu, old_rd->online);
>>
>>  		if (atomic_dec_and_test(&old_rd->refcount))
>> -			kfree(old_rd);
>> +			reap = old_rd;
>
> Unrelated to the in-atomic issue: I was wondering, if this if statement
> isn't true, can the old_rd memory get leaked, or is it cleaned up
> someplace else?

Each RQ always holds a reference to exactly one root-domain, and that is
what rd->refcount tracks.  When the last RQ drops its reference to a
particular instance, we free the structure.  So this is the only place
where we clean up, but it should also be the only place we need to
(unless I am misunderstanding you?)

Note that there is one exception: the default root-domain is never
freed, which is why we initialize it with refcount = 1.  So it is
theoretically possible to have this particular root-domain dangling with
no RQs associated with it, but that is by design.

Regards,
-Greg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
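The lifetime rule Greg describes — every RQ pins exactly one root-domain, the last put frees it, and the statically allocated default domain starts at refcount = 1 so it can never reach zero — can be sketched as a user-space analogue.  All names here are illustrative; this is not kernel/sched.c code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* User-space analogue of the root-domain lifetime rules: every "rq"
 * pins exactly one domain via refcount, the last put frees it, and the
 * static default domain starts at refcount = 1 so the free path can
 * never fire for it. */
struct domain {
	atomic_int refcount;
	int is_default;		/* static instance, never freed */
};

static struct domain def_domain = { .refcount = 1, .is_default = 1 };

static struct domain *domain_alloc(void)
{
	struct domain *d = calloc(1, sizeof(*d));
	atomic_store(&d->refcount, 0);
	return d;
}

static void domain_get(struct domain *d)
{
	atomic_fetch_add(&d->refcount, 1);
}

/* Drop a reference; returns 1 if this put freed the domain. */
static int domain_put(struct domain *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1) {
		free(d);	/* last reference: the only cleanup site */
		return 1;
	}
	return 0;
}
```

Because def_domain begins life with that extra reference, a matched get/put pair can never drop it to zero — the "dangling with no RQs, but by design" case described above.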
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
On Tue, Feb 05, 2008 at 11:25:18AM -0700, Gregory Haskins wrote:
> @@ -6241,7 +6242,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
>  		cpu_clear(rq->cpu, old_rd->online);
>
>  		if (atomic_dec_and_test(&old_rd->refcount))
> -			kfree(old_rd);
> +			reap = old_rd;

Unrelated to the in-atomic issue: I was wondering, if this if statement
isn't true, can the old_rd memory get leaked, or is it cleaned up
someplace else?

Daniel
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
>>> On Tue, Feb 5, 2008 at 11:59 AM, in message <[EMAIL PROTECTED]>, Daniel Walker <[EMAIL PROTECTED]> wrote:
>
> I looked at the code a bit, and I'm not sure you need this complexity.
> Once you have replaced the old_rd, there is no reason it needs the
> protection of the run-queue spinlock.  So you could just move the kfree
> down below the spin_unlock_irqrestore().

Here is a new version to address your observation:

---

we cannot kfree while in_atomic()

Signed-off-by: Gregory Haskins <[EMAIL PROTECTED]>

diff --git a/kernel/sched.c b/kernel/sched.c
index e6ad493..0978912 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6226,6 +6226,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 {
 	unsigned long flags;
 	const struct sched_class *class;
+	struct root_domain *reap = NULL;
 
 	spin_lock_irqsave(&rq->lock, flags);
 
@@ -6241,7 +6242,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 		cpu_clear(rq->cpu, old_rd->online);
 
 		if (atomic_dec_and_test(&old_rd->refcount))
-			kfree(old_rd);
+			reap = old_rd;
 	}
 
 	atomic_inc(&rd->refcount);
@@ -6257,6 +6258,10 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 	}
 
 	spin_unlock_irqrestore(&rq->lock, flags);
+
+	/* Don't try to free the memory while in-atomic() */
+	if (unlikely(reap))
+		kfree(reap);
 }
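The shape of the fix above — record the pointer whose refcount hit zero while the lock is held, and free it only after the unlock — can be sketched in plain user-space C.  A pthread mutex stands in for the rq spinlock and the names are hypothetical; on PREEMPT_RT the point is that kfree() may take a sleeping lock, so it must not run inside the spinlocked region:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Minimal sketch of the deferred-free pattern: under the lock we only
 * *record* the object whose refcount dropped to zero; the actual free()
 * happens after the lock is released. */
struct obj {
	int refcount;		/* protected by 'lock' */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop a reference; free outside the critical section.
 * Returns 1 if the object was freed by this put. */
static int obj_put_deferred(struct obj *o)
{
	struct obj *reap = NULL;

	pthread_mutex_lock(&lock);
	if (--o->refcount == 0)
		reap = o;	/* defer: no free() under the lock */
	pthread_mutex_unlock(&lock);

	if (reap) {		/* safe here: the lock is no longer held */
		free(reap);
		return 1;
	}
	return 0;
}
```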
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
>>> On Tue, Feb 5, 2008 at 11:59 AM, in message <[EMAIL PROTECTED]>, Daniel Walker <[EMAIL PROTECTED]> wrote:
> On Mon, Feb 04, 2008 at 10:02:12PM -0700, Gregory Haskins wrote:
>> >>> On Mon, Feb 4, 2008 at 9:51 PM, in message
>> <[EMAIL PROTECTED]>, Daniel Walker
>> <[EMAIL PROTECTED]> wrote:
>> > I get the following when I tried it,
>> >
>> > BUG: sleeping function called from invalid context bash(5126) at
>> > kernel/rtmutex.c:638
>> > in_atomic():1 [0001], irqs_disabled():1
>>
>> Hi Daniel,
>> Can you try this patch and let me know if it fixes your problem?
>>
>> ---
>>
>> use rcu for root-domain kfree
>>
>> Signed-off-by: Gregory Haskins <[EMAIL PROTECTED]>
>>
>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index e6ad493..77e86c1 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -339,6 +339,7 @@ struct root_domain {
>>  	atomic_t refcount;
>>  	cpumask_t span;
>>  	cpumask_t online;
>> +	struct rcu_head rcu;
>>
>>  	/*
>>  	 * The "RT overload" flag: it gets set if a CPU has more than
>> @@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
>>  	return 1;
>>  }
>>
>> +/* rcu callback to free a root-domain */
>> +static void rq_free_root(struct rcu_head *rcu)
>> +{
>> +	kfree(container_of(rcu, struct root_domain, rcu));
>> +}
>> +
>
> I looked at the code a bit, and I'm not sure you need this complexity.
> Once you have replaced the old_rd, there is no reason it needs the
> protection of the run-queue spinlock.  So you could just move the kfree
> down below the spin_unlock_irqrestore().

Indeed.  When I looked last night at the stack, I thought the in_atomic
was coming from further up in the trace.  I see the issue now; thanks,
Daniel.  (Anyone have a spare brown bag?)

-Greg
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
On Mon, Feb 04, 2008 at 10:02:12PM -0700, Gregory Haskins wrote:
> >>> On Mon, Feb 4, 2008 at 9:51 PM, in message
> <[EMAIL PROTECTED]>, Daniel Walker
> <[EMAIL PROTECTED]> wrote:
> > I get the following when I tried it,
> >
> > BUG: sleeping function called from invalid context bash(5126) at
> > kernel/rtmutex.c:638
> > in_atomic():1 [0001], irqs_disabled():1
>
> Hi Daniel,
> Can you try this patch and let me know if it fixes your problem?
>
> ---
>
> use rcu for root-domain kfree
>
> Signed-off-by: Gregory Haskins <[EMAIL PROTECTED]>
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index e6ad493..77e86c1 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -339,6 +339,7 @@ struct root_domain {
>  	atomic_t refcount;
>  	cpumask_t span;
>  	cpumask_t online;
> +	struct rcu_head rcu;
>
>  	/*
>  	 * The "RT overload" flag: it gets set if a CPU has more than
> @@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
>  	return 1;
>  }
>
> +/* rcu callback to free a root-domain */
> +static void rq_free_root(struct rcu_head *rcu)
> +{
> +	kfree(container_of(rcu, struct root_domain, rcu));
> +}
> +

I looked at the code a bit, and I'm not sure you need this complexity.
Once you have replaced the old_rd, there is no reason it needs the
protection of the run-queue spinlock.  So you could just move the kfree
down below the spin_unlock_irqrestore().

Daniel
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
>>> On Mon, Feb 4, 2008 at 9:51 PM, in message <[EMAIL PROTECTED]>, Daniel Walker <[EMAIL PROTECTED]> wrote:
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:

[snip]

>> Also the first thing I tried was to bring CPU1 off-line. That's the
>> fastest way to get irqs, soft-irqs, timers, etc. off of a CPU. But the
>> box hung completely.

After applying my earlier submitted patch, I was able to reproduce the
hang you mentioned.  I poked around in sysrq and it looked like a
deadlock on an rt_mutex, so I turned on lockdep and it found:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.24-rt1-rt #3
-------------------------------------------------------
bash/4604 is trying to acquire lock:
 (events){--..}, at: [<802537b6>] cleanup_workqueue_thread+0x16/0x80

but task is already holding lock:
 (workqueue_mutex){--..}, at: [<80254615>] workqueue_cpu_callback+0xe5/0x140

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #5 (workqueue_mutex){--..}:
       [<80266752>] __lock_acquire+0xf82/0x1090
       [<802668b7>] lock_acquire+0x57/0x80
       [<80254615>] workqueue_cpu_callback+0xe5/0x140
       [<80486818>] _mutex_lock+0x28/0x40
       [<80254615>] workqueue_cpu_callback+0xe5/0x140
       [<8048a575>] notifier_call_chain+0x45/0x90
       [<8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<8025d091>] raw_notifier_call_chain+0x11/0x20
       [<8026d157>] _cpu_down+0x97/0x2d0
       [<8026d3b5>] cpu_down+0x25/0x60
       [<8026d3c8>] cpu_down+0x38/0x60
       [<803d6719>] store_online+0x49/0xa0
       [<803d2774>] sysdev_store+0x24/0x30
       [<8031279f>] sysfs_write_file+0xcf/0x140
       [<802c0005>] vfs_write+0xe5/0x1a0
       [<802c0733>] sys_write+0x53/0x90
       [<8020c4fe>] system_call+0x7e/0x83
       [] 0x

-> #4 (cache_chain_mutex){--..}:
       [<80266752>] __lock_acquire+0xf82/0x1090
       [<802668b7>] lock_acquire+0x57/0x80
       [<802bb7fa>] kmem_cache_create+0x6a/0x480
       [<80486818>] _mutex_lock+0x28/0x40
       [<802bb7fa>] kmem_cache_create+0x6a/0x480
       [<802872a6>] __rcu_read_unlock+0x96/0xb0
       [<8046b824>] fib_hash_init+0xa4/0xe0
       [<80467ee5>] fib_new_table+0x35/0x70
       [<80467fb1>] fib_magic+0x91/0x100
       [<80468093>] fib_add_ifaddr+0x73/0x170
       [<8046829b>] fib_inetaddr_event+0x4b/0x260
       [<8048a575>] notifier_call_chain+0x45/0x90
       [<8025d2ce>] __blocking_notifier_call_chain+0x5e/0x90
       [<8025d311>] blocking_notifier_call_chain+0x11/0x20
       [<8045f714>] __inet_insert_ifa+0xd4/0x170
       [<8045f7bd>] inet_insert_ifa+0xd/0x10
       [<8046083a>] inetdev_event+0x45a/0x510
       [<8041ee4d>] fib_rules_event+0x6d/0x160
       [<8048a575>] notifier_call_chain+0x45/0x90
       [<8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<8025d091>] raw_notifier_call_chain+0x11/0x20
       [<8040f466>] call_netdevice_notifiers+0x16/0x20
       [<80410f6d>] dev_open+0x8d/0xa0
       [<8040f5e9>] dev_change_flags+0x99/0x1b0
       [<80460ffd>] devinet_ioctl+0x5ad/0x760
       [<80410d6a>] dev_ioctl+0x4ba/0x590
       [<8026523d>] trace_hardirqs_on+0xd/0x10
       [<8046162d>] inet_ioctl+0x5d/0x80
       [<80400f21>] sock_ioctl+0xd1/0x260
       [<802ce154>] do_ioctl+0x34/0xa0
       [<802ce239>] vfs_ioctl+0x79/0x2f0
       [<80485f30>] trace_hardirqs_on_thunk+0x3a/0x3f
       [<802ce532>] sys_ioctl+0x82/0xa0
       [<8020c4fe>] system_call+0x7e/0x83
       [] 0x

-> #3 ((inetaddr_chain).rwsem){..--}:
       [<80266752>] __lock_acquire+0xf82/0x1090
       [<802668b7>] lock_acquire+0x57/0x80
       [<8026ca9b>] rt_down_read+0xb/0x10
       [<8026ca29>] __rt_down_read+0x29/0x80
       [<8026ca9b>] rt_down_read+0xb/0x10
       [<8025d2b8>] __blocking_notifier_call_chain+0x48/0x90
       [<8025d311>] blocking_notifier_call_chain+0x11/0x20
       [<8045f714>] __inet_insert_ifa+0xd4/0x170
       [<8045f7bd>] inet_insert_ifa+0xd/0x10
       [<8046083a>] inetdev_event+0x45a/0x510
       [<8041ee4d>] fib_rules_event+0x6d/0x160
       [<8048a575>] notifier_call_chain+0x45/0x90
       [<8025d079>] __raw_notifier_call_chain+0x9/0x10
       [<8025d091>] raw_notifier_call_chain+0x11/0x20
       [<8040f466>] call_netdevice_notifiers+0x16/0x20
       [<80410f6d>] dev_open+0x8d/0xa0
       [<8040f5e9>] dev_change_flags+0x99/0x1b0
       [<80460ffd>] devinet_ioctl+0x5ad/0x760
       [<80410d6a>] dev_ioctl+0x4ba/0x590
       [<8026523d>] trace_hardirqs_on+0xd/0x10
       [<8046162d>] inet_ioctl+0x5d/0x80
       [<80400f21>] sock_ioctl+0xd1/0x260
       [<802ce154>] do_ioctl+0x34/0xa0
       [<802ce239>] vfs_ioctl+0x79/0x2f0
       [<80485f30>] trace_hardirqs_on_thunk+0x3a/0x3f
       [<802ce532>] sys_ioctl+0x82/0xa0
       [<8020c4fe>] system_call+0x7e/0x83
       [] 0x

-> #2 (rtnl_mutex){--..}:
       [<80266752>] __lock_acquire+0xf82/0x1090
       [<802668b7>] lock_acquire+0x57/0x80
       [] rtnl_lock+0x10/0x20
       [<80486818>] _mutex_lock+0x28/0x40
       [] rtnl_lock+0x10/0x20
       [] linkwatch_event+0x9/0x40
       [] run_workqueue+0x221/0x2f0
       [] linkwatch_event+0x0/0x40
       [] worker_thread+0xd3/0x140
       [] autoremove_wake_function+0x0/0x40
       [] worker_thread+0x0/0x140
       [] kthread+0x4d/0x80
       [] child_rip+0xa/0x12
       [] restore_args+0x0/0x30
       [] kthread+0x0/0x80
       [] child_rip+0x0/0x12
       [] 0x

-> #1 ((linkwatch_work).work){--..}:
       [<80266752>] __lock_acquire+0xf82/0x1090
       [<802668b7>] lock_acquire+0x57/0x80
       [] run_workqueue+0x1ca/0x2f0
       [] run_workqueue+0x21a/0x2f0
       [] linkwatch_event+0x0/0x40
       [] worker_thread+0xd3/0x140
       [] autoremove_wake_function+0x0/0x40
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
>>> On Mon, Feb 4, 2008 at 9:51 PM, in message <[EMAIL PROTECTED]>, Daniel Walker <[EMAIL PROTECTED]> wrote:
> I get the following when I tried it,
>
> BUG: sleeping function called from invalid context bash(5126) at
> kernel/rtmutex.c:638
> in_atomic():1 [0001], irqs_disabled():1

Hi Daniel,

Can you try this patch and let me know if it fixes your problem?

---

use rcu for root-domain kfree

Signed-off-by: Gregory Haskins <[EMAIL PROTECTED]>

diff --git a/kernel/sched.c b/kernel/sched.c
index e6ad493..77e86c1 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -339,6 +339,7 @@ struct root_domain {
 	atomic_t refcount;
 	cpumask_t span;
 	cpumask_t online;
+	struct rcu_head rcu;
 
 	/*
 	 * The "RT overload" flag: it gets set if a CPU has more than
@@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
 	return 1;
 }
 
+/* rcu callback to free a root-domain */
+static void rq_free_root(struct rcu_head *rcu)
+{
+	kfree(container_of(rcu, struct root_domain, rcu));
+}
+
 static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 {
 	unsigned long flags;
@@ -6241,7 +6248,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 		cpu_clear(rq->cpu, old_rd->online);
 
 		if (atomic_dec_and_test(&old_rd->refcount))
-			kfree(old_rd);
+			call_rcu(&old_rd->rcu, rq_free_root);
 	}
 
 	atomic_inc(&rd->refcount);
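The call_rcu()/container_of() idiom in this patch — embed a small callback head in the object, hand only that head to the deferral machinery, and recover the outer object in the callback — can be sketched as a user-space analogue.  There is no real RCU grace period here; a one-slot pending queue stands in for it, and all names are illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for struct rcu_head: just a callback pointer. */
struct cb_head {
	void (*func)(struct cb_head *);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct domain {
	int id;
	struct cb_head rcu;	/* embedded, like struct rcu_head */
};

static struct cb_head *pending;	/* one-slot stand-in for the RCU queue */
static int freed_id = -1;	/* records which object was reclaimed */

/* Stand-in for call_rcu(): remember the head and its callback. */
static void queue_free(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
	pending = head;
}

/* Stand-in for "grace period elapsed": run the deferred callback. */
static void run_pending(void)
{
	if (pending) {
		pending->func(pending);
		pending = NULL;
	}
}

/* The callback gets only the embedded head; container_of() recovers
 * the outer object so it can be freed. */
static void domain_free_cb(struct cb_head *head)
{
	struct domain *d = container_of(head, struct domain, rcu);
	freed_id = d->id;
	free(d);
}
```

The payoff of the pattern is that the reclaim machinery needs to know nothing about struct domain; it only ever sees the embedded head.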
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
Daniel Walker wrote:
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
>> This is just an FYI. As part of the "Isolated CPU extensions" thread,
>> Daniel suggested that I check out the latest RT kernels. So I did, or at
>> least tried to, and immediately spotted a couple of issues.
>>
>> The machine I'm running it on is:
>> 	HP xw9300, Dual Opteron, NUMA
>>
>> It looks like with the -rt kernel, IRQ affinity masks are ignored on
>> that system. i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity,
>> but the interrupts keep coming to CPU1. Vanilla 2.6.24 does not have
>> that issue.
>
> I tried this, and it works according to /proc/interrupts .. Are you
> looking at the interrupt thread's affinity?

Nope. I'm looking at /proc/interrupts, i.e. the interrupt count keeps
incrementing for cpu1 even though the affinity mask is set to 1. The IRQ
thread affinity was, btw, set to 3, which is probably wrong.

To clarify, by default after reboot:
- IRQ affinity is set to 3, IRQ thread affinity is set to 3
- User writes 1 into /proc/irq/N/smp_affinity
- IRQ affinity is now set to 1, IRQ thread affinity is still set to 3

It'd still work, I guess, but it does not seem right. Ideally the IRQ
thread affinity should change as well. We could of course just have some
user-space tool that adjusts both.

Looks like Greg already replied to the cpu hotplug issue. For me it did
not oops. It just got stuck, probably because it could not move an IRQ
due to the broken IRQ affinity logic.

Max
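A user-space tool like the one suggested above would set the hardware IRQ affinity and retarget the IRQ thread in one step.  A minimal sketch: only the mask formatting is concrete, while the apply steps are outlined in comments because they need root and an -rt kernel, and the thread-lookup details are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a CPU bitmask as the hex string that /proc/irq/N/smp_affinity
 * expects (e.g. mask for CPU0 only -> "1", CPU5 only -> "20"). */
static void mask_to_hex(unsigned long mask, char *buf, size_t len)
{
	snprintf(buf, len, "%lx", mask);
}

/* Applying it would look roughly like this (not run here):
 *   1. open("/proc/irq/23/smp_affinity", O_WRONLY) and write the hex
 *      mask                                   -- hardware IRQ affinity
 *   2. find the pid of the matching "IRQ-23" kernel thread (e.g. by
 *      scanning /proc/<pid>/comm) and call sched_setaffinity() on it
 *                                             -- IRQ thread affinity
 */
```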
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
Hi Daniel,

See inline...

>>> On Mon, Feb 4, 2008 at 9:51 PM, in message <[EMAIL PROTECTED]>, Daniel Walker <[EMAIL PROTECTED]> wrote:
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
>> This is just an FYI. As part of the "Isolated CPU extensions" thread,
>> Daniel suggested that I check out the latest RT kernels. So I did, or at
>> least tried to, and immediately spotted a couple of issues.
>>
>> The machine I'm running it on is:
>> 	HP xw9300, Dual Opteron, NUMA
>>
>> It looks like with the -rt kernel, IRQ affinity masks are ignored on
>> that system. i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity,
>> but the interrupts keep coming to CPU1. Vanilla 2.6.24 does not have
>> that issue.
>
> I tried this, and it works according to /proc/interrupts .. Are you
> looking at the interrupt thread's affinity?
>
>> Also, the first thing I tried was to bring CPU1 off-line. That's the
>> fastest way to get irqs, soft-irqs, timers, etc. off of a CPU. But the
>> box hung completely. It also managed to mess up my ext3 filesystem to
>> the point where it required a manual fsck (have not seen that for a
>> couple of years now). I tried the same thing (ie echo 0 >
>> /sys/devices/cpu/cpu1/online) from the console. It hung again with a
>> message that looked something like:
>> 	CPU1 is now off-line
>> 	Thread IRQ-23 is on CPU1 ...
>
> I get the following when I tried it,
>
> BUG: sleeping function called from invalid context bash(5126) at
> kernel/rtmutex.c:638
> in_atomic():1 [0001], irqs_disabled():1
> Pid: 5126, comm: bash Not tainted 2.6.24-rt1 #1
>  [<c010506b>] show_trace_log_lvl+0x1d/0x3a
>  [<c01059cd>] show_trace+0x12/0x14
>  [<c0106151>] dump_stack+0x6c/0x72
>  [<c011d153>] __might_sleep+0xe8/0xef
>  [<c03b2326>] __rt_spin_lock+0x24/0x59
>  [<c03b2363>] rt_spin_lock+0x8/0xa
>  [<c0165b2f>] kfree+0x2c/0x8d

Doh!  This is my bug.  I'll have to come up with a good way to free that
memory under atomic, or do this another way.

Stay tuned.

>  [<c011eacb>] rq_attach_root+0x67/0xba
>  [<c01209ae>] cpu_attach_domain+0x2b6/0x2f7
>  [<c0120a12>] detach_destroy_domains+0x23/0x37
>  [<c0121368>] update_sched_domains+0x2d/0x40
>  [<c013b482>] notifier_call_chain+0x2b/0x55
>  [<c013b4d9>] __raw_notifier_call_chain+0x19/0x1e
>  [<c01420d3>] _cpu_down+0x84/0x24c
>  [<c01422c3>] cpu_down+0x28/0x3a
>  [<c029f59e>] store_online+0x27/0x5a
>  [<c029c9dc>] sysdev_store+0x20/0x25
>  [<c019a695>] sysfs_write_file+0xad/0xde
>  [<c0169929>] vfs_write+0x82/0xb8
>  [<c0169e2a>] sys_write+0x3d/0x61
>  [<c0104072>] sysenter_past_esp+0x5f/0x85
> ===
> ---
> | preempt count: 0001 ]
> | 1-level deep critical section nesting:
>
> .. [<c03b25e2>] __spin_lock_irqsave+0x14/0x3b
> .[<c011ea76>] .. ( <= rq_attach_root+0x12/0xba)
>
> Which is clearly a problem ..
>
> (I added linux-rt-users to the CC)
>
> Daniel
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
> This is just an FYI. As part of the "Isolated CPU extensions" thread,
> Daniel suggested that I check out the latest RT kernels. So I did, or at
> least tried to, and immediately spotted a couple of issues.
>
> The machine I'm running it on is:
> 	HP xw9300, Dual Opteron, NUMA
>
> It looks like with the -rt kernel, IRQ affinity masks are ignored on
> that system. i.e. I write 1 to, let's say, /proc/irq/23/smp_affinity,
> but the interrupts keep coming to CPU1. Vanilla 2.6.24 does not have
> that issue.

I tried this, and it works according to /proc/interrupts .. Are you
looking at the interrupt thread's affinity?

> Also, the first thing I tried was to bring CPU1 off-line. That's the
> fastest way to get irqs, soft-irqs, timers, etc. off of a CPU. But the
> box hung completely. It also managed to mess up my ext3 filesystem to
> the point where it required a manual fsck (have not seen that for a
> couple of years now). I tried the same thing (ie echo 0 >
> /sys/devices/cpu/cpu1/online) from the console. It hung again with a
> message that looked something like:
> 	CPU1 is now off-line
> 	Thread IRQ-23 is on CPU1 ...

I get the following when I tried it,

BUG: sleeping function called from invalid context bash(5126) at
kernel/rtmutex.c:638
in_atomic():1 [0001], irqs_disabled():1
Pid: 5126, comm: bash Not tainted 2.6.24-rt1 #1
 [<c010506b>] show_trace_log_lvl+0x1d/0x3a
 [<c01059cd>] show_trace+0x12/0x14
 [<c0106151>] dump_stack+0x6c/0x72
 [<c011d153>] __might_sleep+0xe8/0xef
 [<c03b2326>] __rt_spin_lock+0x24/0x59
 [<c03b2363>] rt_spin_lock+0x8/0xa
 [<c0165b2f>] kfree+0x2c/0x8d
 [<c011eacb>] rq_attach_root+0x67/0xba
 [<c01209ae>] cpu_attach_domain+0x2b6/0x2f7
 [<c0120a12>] detach_destroy_domains+0x23/0x37
 [<c0121368>] update_sched_domains+0x2d/0x40
 [<c013b482>] notifier_call_chain+0x2b/0x55
 [<c013b4d9>] __raw_notifier_call_chain+0x19/0x1e
 [<c01420d3>] _cpu_down+0x84/0x24c
 [<c01422c3>] cpu_down+0x28/0x3a
 [<c029f59e>] store_online+0x27/0x5a
 [<c029c9dc>] sysdev_store+0x20/0x25
 [<c019a695>] sysfs_write_file+0xad/0xde
 [<c0169929>] vfs_write+0x82/0xb8
 [<c0169e2a>] sys_write+0x3d/0x61
 [<c0104072>] sysenter_past_esp+0x5f/0x85
===
---
| preempt count: 0001 ]
| 1-level deep critical section nesting:

.. [<c03b25e2>] __spin_lock_irqsave+0x14/0x3b
.[<c011ea76>] .. ( <= rq_attach_root+0x12/0xba)

Which is clearly a problem ..

(I added linux-rt-users to the CC)

Daniel
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
Hi Daniel,

See inline...

>>> On Mon, Feb 4, 2008 at 9:51 PM, in message <[EMAIL PROTECTED]>, Daniel
>>> Walker <[EMAIL PROTECTED]> wrote:
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
>> This is just an FYI. As part of the "Isolated CPU extensions" thread Daniel
>> suggested that I check out the latest RT kernels. So I did, or at least
>> tried to, and immediately spotted a couple of issues.
>>
>> The machine I'm running it on is:
>> 	HP xw9300, Dual Opteron, NUMA
>>
>> It looks like with the -rt kernel, IRQ affinity masks are ignored on that
>> system. ie I write 1 to, let's say, /proc/irq/23/smp_affinity, but the
>> interrupts keep coming to CPU1. Vanilla 2.6.24 does not have that issue.
>
> I tried this, and it works according to /proc/interrupts .. Are you
> looking at the interrupt thread's affinity?
>
>> Also the first thing I tried was to bring CPU1 off-line. That's the
>> fastest way to get irqs, soft-irqs, timers, etc off a CPU. But the box
>> hung completely. It also managed to mess up my ext3 filesystem to the
>> point where it required a manual fsck (have not seen that for a couple
>> of years now). I tried the same thing (ie echo 0 >
>> /sys/devices/cpu/cpu1/online) from the console. It hung again with a
>> message that looked something like:
>>
>> 	CPU1 is now off-line
>> 	Thread IRQ-23 is on CPU1 ...
>
> I get the following when I tried it,
>
> BUG: sleeping function called from invalid context bash(5126) at kernel/rtmutex.c:638
> in_atomic():1 [0001], irqs_disabled():1
> Pid: 5126, comm: bash Not tainted 2.6.24-rt1 #1
>  [<c010506b>] show_trace_log_lvl+0x1d/0x3a
>  [<c01059cd>] show_trace+0x12/0x14
>  [<c0106151>] dump_stack+0x6c/0x72
>  [<c011d153>] __might_sleep+0xe8/0xef
>  [<c03b2326>] __rt_spin_lock+0x24/0x59
>  [<c03b2363>] rt_spin_lock+0x8/0xa
>  [<c0165b2f>] kfree+0x2c/0x8d

Doh! This is my bug. I'll have to come up with a good way to free that
memory under atomic, or do this another way. Stay tuned.
>  [<c011eacb>] rq_attach_root+0x67/0xba
>  [<c01209ae>] cpu_attach_domain+0x2b6/0x2f7
>  [<c0120a12>] detach_destroy_domains+0x23/0x37
>  [<c0121368>] update_sched_domains+0x2d/0x40
>  [<c013b482>] notifier_call_chain+0x2b/0x55
>  [<c013b4d9>] __raw_notifier_call_chain+0x19/0x1e
>  [<c01420d3>] _cpu_down+0x84/0x24c
>  [<c01422c3>] cpu_down+0x28/0x3a
>  [<c029f59e>] store_online+0x27/0x5a
>  [<c029c9dc>] sysdev_store+0x20/0x25
>  [<c019a695>] sysfs_write_file+0xad/0xde
>  [<c0169929>] vfs_write+0x82/0xb8
>  [<c0169e2a>] sys_write+0x3d/0x61
>  [<c0104072>] sysenter_past_esp+0x5f/0x85
>  =======================
> ---------------------------
> | preempt count: 0001 ]
> | 1-level deep critical section nesting:
> ----------------------------------------
> .. [<c03b25e2>] .... __spin_lock_irqsave+0x14/0x3b
> .....[<c011ea76>] ..   ( <= rq_attach_root+0x12/0xba)
>
> Which is clearly a problem .. (I added linux-rt-users to the CC)
>
> Daniel
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
Daniel Walker wrote:
> On Mon, Feb 04, 2008 at 03:35:13PM -0800, Max Krasnyanskiy wrote:
>> This is just an FYI. As part of the "Isolated CPU extensions" thread Daniel
>> suggested that I check out the latest RT kernels. So I did, or at least
>> tried to, and immediately spotted a couple of issues.
>>
>> The machine I'm running it on is:
>> 	HP xw9300, Dual Opteron, NUMA
>>
>> It looks like with the -rt kernel, IRQ affinity masks are ignored on that
>> system. ie I write 1 to, let's say, /proc/irq/23/smp_affinity, but the
>> interrupts keep coming to CPU1. Vanilla 2.6.24 does not have that issue.
>
> I tried this, and it works according to /proc/interrupts .. Are you
> looking at the interrupt thread's affinity ?

Nope. I'm looking at /proc/interrupts. ie The interrupt count keeps
incrementing for cpu1 even though the affinity mask is set to 1. The IRQ
thread affinity was btw set to 3, which is probably wrong.

To clarify, by default after reboot:
- IRQ affinity set to 3, IRQ thread affinity set to 3
- User writes 1 into /proc/irq/N/smp_affinity
- IRQ affinity is now set to 1, IRQ thread affinity is still set to 3

It'd still work, I guess, but it does not seem right. Ideally the IRQ
thread affinity should have changed as well. We could of course just
have some user-space tool that adjusts both.

Looks like Greg already replied to the cpu hotplug issue. For me it did
not oops. It just got stuck, probably because it could not move an IRQ
due to the broken IRQ affinity logic.

Max
Re: CPU hotplug and IRQ affinity with 2.6.24-rt1
>>> On Mon, Feb 4, 2008 at 9:51 PM, in message <[EMAIL PROTECTED]>, Daniel
>>> Walker <[EMAIL PROTECTED]> wrote:
> I get the following when I tried it,
>
> BUG: sleeping function called from invalid context bash(5126) at
> kernel/rtmutex.c:638 in_atomic():1 [0001], irqs_disabled():1

Hi Daniel,

Can you try this patch and let me know if it fixes your problem?

---

use rcu for root-domain kfree

Signed-off-by: Gregory Haskins <[EMAIL PROTECTED]>

diff --git a/kernel/sched.c b/kernel/sched.c
index e6ad493..77e86c1 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -339,6 +339,7 @@ struct root_domain {
 	atomic_t refcount;
 	cpumask_t span;
 	cpumask_t online;
+	struct rcu_head rcu;
 
 	/*
 	 * The "RT overload" flag: it gets set if a CPU has more than
@@ -6222,6 +6223,12 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
 	return 1;
 }
 
+/* rcu callback to free a root-domain */
+static void rq_free_root(struct rcu_head *rcu)
+{
+	kfree(container_of(rcu, struct root_domain, rcu));
+}
+
 static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 {
 	unsigned long flags;
@@ -6241,7 +6248,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 		cpu_clear(rq->cpu, old_rd->online);
 
 		if (atomic_dec_and_test(&old_rd->refcount))
-			kfree(old_rd);
+			call_rcu(&old_rd->rcu, rq_free_root);
 	}
 
 	atomic_inc(&rd->refcount);