[ 37/45] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Stephen Boyd

commit 40fea92ffb5fa0ef26d10ae0fe5688bc8e61c791 upstream.

pm_qos_update_request_timeout() updates a qos and then schedules
a delayed work item to bring the qos back down to the default
after the timeout. When the work item runs, pm_qos_work_fn() will
call pm_qos_update_request() and deadlock because it tries to
cancel itself via cancel_delayed_work_sync(). Future callers of
that qos will also hang waiting to cancel the work that is
canceling itself. Let's extract the little bit of code that does
the real work of pm_qos_update_request() and call it from the
work function so that we don't deadlock.

Before ed1ac6e (PM: don't use [delayed_]work_pending()) this didn't
happen because the work function wouldn't try to cancel itself.

[backport to 3.10 - gregkh]

Signed-off-by: Stephen Boyd
Reviewed-by: Tejun Heo
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman

---
 kernel/power/qos.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -293,6 +293,15 @@ int pm_qos_request_active(struct pm_qos_
 }
 EXPORT_SYMBOL_GPL(pm_qos_request_active);
 
+static void __pm_qos_update_request(struct pm_qos_request *req,
+				    s32 new_value)
+{
+	if (new_value != req->node.prio)
+		pm_qos_update_target(
+			pm_qos_array[req->pm_qos_class]->constraints,
+			&req->node, PM_QOS_UPDATE_REQ, new_value);
+}
+
 /**
  * pm_qos_work_fn - the timeout handler of pm_qos_update_request_timeout
  * @work: work struct for the delayed work (timeout)
@@ -305,7 +314,7 @@ static void pm_qos_work_fn(struct work_s
 					  struct pm_qos_request,
 					  work);
 
-	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
+	__pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
 }
 
 /**
@@ -365,6 +374,8 @@ void pm_qos_update_request(struct pm_qos
 		pm_qos_update_target(
 			pm_qos_array[req->pm_qos_class]->constraints,
 			&req->node, PM_QOS_UPDATE_REQ, new_value);
+
+	__pm_qos_update_request(req, new_value);
 }
 EXPORT_SYMBOL_GPL(pm_qos_update_request);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
On Tuesday, August 13, 2013 06:13:25 PM Tejun Heo wrote:
> Hello,
>
> On Tue, Aug 13, 2013 at 02:12:40PM -0700, Stephen Boyd wrote:
> > @@ -308,7 +319,7 @@ static void pm_qos_work_fn(struct work_struct *work)
> >  					  struct pm_qos_request,
> >  					  work);
> >
> > -	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
> > +	__pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
>
> Maybe a short comment explaining why this is different would be nice?
> Other than that,
>
> Reviewed-by: Tejun Heo

Thanks guys, I'm going to push that as a fix for 3.11-rc6 and stable.

Rafael
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
Hello,

On Tue, Aug 13, 2013 at 02:12:40PM -0700, Stephen Boyd wrote:
> @@ -308,7 +319,7 @@ static void pm_qos_work_fn(struct work_struct *work)
>  					  struct pm_qos_request,
>  					  work);
>
> -	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
> +	__pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);

Maybe a short comment explaining why this is different would be nice?
Other than that,

Reviewed-by: Tejun Heo

Thanks.

--
tejun
[PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
pm_qos_update_request_timeout() updates a qos and then schedules
a delayed work item to bring the qos back down to the default
after the timeout. When the work item runs, pm_qos_work_fn() will
call pm_qos_update_request() and deadlock because it tries to
cancel itself via cancel_delayed_work_sync(). Future callers of
that qos will also hang waiting to cancel the work that is
canceling itself. Let's extract the little bit of code that does
the real work of pm_qos_update_request() and call it from the
work function so that we don't deadlock.

Cc: Tejun Heo
Signed-off-by: Stephen Boyd
---
 kernel/power/qos.c | 20 ++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/kernel/power/qos.c b/kernel/power/qos.c
index 06fe285..a394297 100644
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -296,6 +296,17 @@ int pm_qos_request_active(struct pm_qos_request *req)
 }
 EXPORT_SYMBOL_GPL(pm_qos_request_active);
 
+static void __pm_qos_update_request(struct pm_qos_request *req,
+				    s32 new_value)
+{
+	trace_pm_qos_update_request(req->pm_qos_class, new_value);
+
+	if (new_value != req->node.prio)
+		pm_qos_update_target(
+			pm_qos_array[req->pm_qos_class]->constraints,
+			&req->node, PM_QOS_UPDATE_REQ, new_value);
+}
+
 /**
  * pm_qos_work_fn - the timeout handler of pm_qos_update_request_timeout
  * @work: work struct for the delayed work (timeout)
@@ -308,7 +319,7 @@ static void pm_qos_work_fn(struct work_struct *work)
 					  struct pm_qos_request,
 					  work);
 
-	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
+	__pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
 }
 
 /**
@@ -364,12 +375,7 @@ void pm_qos_update_request(struct pm_qos_request *req,
 	}
 
 	cancel_delayed_work_sync(&req->work);
-
-	trace_pm_qos_update_request(req->pm_qos_class, new_value);
-	if (new_value != req->node.prio)
-		pm_qos_update_target(
-			pm_qos_array[req->pm_qos_class]->constraints,
-			&req->node, PM_QOS_UPDATE_REQ, new_value);
+	__pm_qos_update_request(req, new_value);
 }
 EXPORT_SYMBOL_GPL(pm_qos_update_request);

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
On 08/13, Rafael J. Wysocki wrote:
> On Tuesday, August 13, 2013 01:01:46 PM Tejun Heo wrote:
> > Hello,
> >
> > On Tue, Aug 13, 2013 at 09:46:26AM -0700, Stephen Boyd wrote:
> > > >> +	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
> > > >> +		pm_qos_update_target(
> > > >> +			pm_qos_array[req->pm_qos_class]->constraints,
> > > >> +			&req->node, PM_QOS_UPDATE_REQ,
> > > >> +			PM_QOS_DEFAULT_VALUE);
> > > > Maybe it'd be cleaner to add a param or internal variant of
> > > > pm_qos_update_request()?
> > >
> > > Maybe, but I was trying to make a minimal fix here.
> >
> > Hmmm it just looks like things can easily get out of sync with the
> > complex function call.
>
> Yes, that's just duplicated code.
>
> > I don't think it'll be too invasive if you introduce an internal variant
> > which doesn't do the canceling. Rafael, what do you think?
>
> I'd move the part of pm_qos_update_request() below the
> cancel_delayed_work_sync() to a separate static function that'd be
> called from two places.

Ok I will throw this all into one patch and resend.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
On Tuesday, August 13, 2013 01:01:46 PM Tejun Heo wrote:
> Hello,
>
> On Tue, Aug 13, 2013 at 09:46:26AM -0700, Stephen Boyd wrote:
> > >> +	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
> > >> +		pm_qos_update_target(
> > >> +			pm_qos_array[req->pm_qos_class]->constraints,
> > >> +			&req->node, PM_QOS_UPDATE_REQ,
> > >> +			PM_QOS_DEFAULT_VALUE);
> > > Maybe it'd be cleaner to add a param or internal variant of
> > > pm_qos_update_request()?
> >
> > Maybe, but I was trying to make a minimal fix here.
>
> Hmmm it just looks like things can easily get out of sync with the
> complex function call.

Yes, that's just duplicated code.

> I don't think it'll be too invasive if you introduce an internal variant
> which doesn't do the canceling. Rafael, what do you think?

I'd move the part of pm_qos_update_request() below the
cancel_delayed_work_sync() to a separate static function that'd be
called from two places.

Thanks,
Rafael

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
Hello,

On Tue, Aug 13, 2013 at 09:46:26AM -0700, Stephen Boyd wrote:
> >> +	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
> >> +		pm_qos_update_target(
> >> +			pm_qos_array[req->pm_qos_class]->constraints,
> >> +			&req->node, PM_QOS_UPDATE_REQ,
> >> +			PM_QOS_DEFAULT_VALUE);
> > Maybe it'd be cleaner to add a param or internal variant of
> > pm_qos_update_request()?
>
> Maybe, but I was trying to make a minimal fix here.

Hmmm it just looks like things can easily get out of sync with the
complex function call. I don't think it'll be too invasive if you
introduce an internal variant which doesn't do the canceling. Rafael,
what do you think?

Thanks.

--
tejun
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
On 08/13/13 09:43, Tejun Heo wrote:
> Hello, Stephen.
>
> On Thu, Aug 08, 2013 at 01:13:57PM -0700, Stephen Boyd wrote:
>> pm_qos_update_request_timeout() updates a qos and then schedules
>> a delayed work item to bring the qos back down to the default
>> after the timeout. When the work item runs, pm_qos_work_fn() will
>> call pm_qos_update_request() and deadlock because it tries to
>> cancel itself via cancel_delayed_work_sync(). Future callers of
>> that qos will also hang waiting to cancel the work that is
>> canceling itself. Before ed1ac6e (PM: don't use
>> [delayed_]work_pending(), 2013-01-11) this didn't happen because
>> the work function wouldn't try to cancel itself.
>
> I see. That must have been racy tho. If the work item execution
> races someone else queuing the work item, the same deadlock could
> happen, right?

Yes you're right. It was always racy.

>> Let's just do the little bit of pm_qos_update_request() here so
>> that we don't deadlock.
>>
>> Cc: Tejun Heo
>> Signed-off-by: Stephen Boyd
>> ---
>>  kernel/power/qos.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
>> index 06fe285..d52d314 100644
>> --- a/kernel/power/qos.c
>> +++ b/kernel/power/qos.c
>> @@ -308,7 +308,11 @@ static void pm_qos_work_fn(struct work_struct *work)
>>  					  struct pm_qos_request,
>>  					  work);
>>
>> -	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
>> +	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
>> +		pm_qos_update_target(
>> +			pm_qos_array[req->pm_qos_class]->constraints,
>> +			&req->node, PM_QOS_UPDATE_REQ,
>> +			PM_QOS_DEFAULT_VALUE);
>
> Maybe it'd be cleaner to add a param or internal variant of
> pm_qos_update_request()?

Maybe, but I was trying to make a minimal fix here.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
Hello, Stephen.

On Thu, Aug 08, 2013 at 01:13:57PM -0700, Stephen Boyd wrote:
> pm_qos_update_request_timeout() updates a qos and then schedules
> a delayed work item to bring the qos back down to the default
> after the timeout. When the work item runs, pm_qos_work_fn() will
> call pm_qos_update_request() and deadlock because it tries to
> cancel itself via cancel_delayed_work_sync(). Future callers of
> that qos will also hang waiting to cancel the work that is
> canceling itself. Before ed1ac6e (PM: don't use
> [delayed_]work_pending(), 2013-01-11) this didn't happen because
> the work function wouldn't try to cancel itself.

I see. That must have been racy tho. If the work item execution
races someone else queuing the work item, the same deadlock could
happen, right?

> Let's just do the little bit of pm_qos_update_request() here so
> that we don't deadlock.
>
> Cc: Tejun Heo
> Signed-off-by: Stephen Boyd
> ---
>  kernel/power/qos.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
> index 06fe285..d52d314 100644
> --- a/kernel/power/qos.c
> +++ b/kernel/power/qos.c
> @@ -308,7 +308,11 @@ static void pm_qos_work_fn(struct work_struct *work)
>  					  struct pm_qos_request,
>  					  work);
>
> -	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
> +	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
> +		pm_qos_update_target(
> +			pm_qos_array[req->pm_qos_class]->constraints,
> +			&req->node, PM_QOS_UPDATE_REQ,
> +			PM_QOS_DEFAULT_VALUE);

Maybe it'd be cleaner to add a param or internal variant of
pm_qos_update_request()?

Thanks.

--
tejun
Re: [PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
On 08/08/13 13:13, Stephen Boyd wrote:
> pm_qos_update_request_timeout() updates a qos and then schedules
> a delayed work item to bring the qos back down to the default
> after the timeout. When the work item runs, pm_qos_work_fn() will
> call pm_qos_update_request() and deadlock because it tries to
> cancel itself via cancel_delayed_work_sync(). Future callers of
> that qos will also hang waiting to cancel the work that is
> canceling itself. Before ed1ac6e (PM: don't use
> [delayed_]work_pending(), 2013-01-11) this didn't happen because
> the work function wouldn't try to cancel itself.
>
> Let's just do the little bit of pm_qos_update_request() here so
> that we don't deadlock.
>
> Cc: Tejun Heo
> Signed-off-by: Stephen Boyd
> ---

Ping?

>  kernel/power/qos.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
> index 06fe285..d52d314 100644
> --- a/kernel/power/qos.c
> +++ b/kernel/power/qos.c
> @@ -308,7 +308,11 @@ static void pm_qos_work_fn(struct work_struct *work)
>  					  struct pm_qos_request,
>  					  work);
>
> -	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
> +	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
> +		pm_qos_update_target(
> +			pm_qos_array[req->pm_qos_class]->constraints,
> +			&req->node, PM_QOS_UPDATE_REQ,
> +			PM_QOS_DEFAULT_VALUE);
> }
>
> /**

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
[PATCH] PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
pm_qos_update_request_timeout() updates a qos and then schedules
a delayed work item to bring the qos back down to the default
after the timeout. When the work item runs, pm_qos_work_fn() will
call pm_qos_update_request() and deadlock because it tries to
cancel itself via cancel_delayed_work_sync(). Future callers of
that qos will also hang waiting to cancel the work that is
canceling itself. Before ed1ac6e (PM: don't use
[delayed_]work_pending(), 2013-01-11) this didn't happen because
the work function wouldn't try to cancel itself.

Let's just do the little bit of pm_qos_update_request() here so
that we don't deadlock.

Cc: Tejun Heo
Signed-off-by: Stephen Boyd
---
 kernel/power/qos.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/power/qos.c b/kernel/power/qos.c
index 06fe285..d52d314 100644
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -308,7 +308,11 @@ static void pm_qos_work_fn(struct work_struct *work)
 					  struct pm_qos_request,
 					  work);
 
-	pm_qos_update_request(req, PM_QOS_DEFAULT_VALUE);
+	if (PM_QOS_DEFAULT_VALUE != req->node.prio)
+		pm_qos_update_target(
+			pm_qos_array[req->pm_qos_class]->constraints,
+			&req->node, PM_QOS_UPDATE_REQ,
+			PM_QOS_DEFAULT_VALUE);
 }
 
 /**

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
[v2.6.34-stable 28/77] p54spi: Fix workqueue deadlock
From: Michael Büsch

-------------------
This is a commit scheduled for the next v2.6.34 longterm release.
http://git.kernel.org/?p=linux/kernel/git/paulg/longterm-queue-2.6.34.git
If you see a problem with using this for longterm, please comment.
-------------------

commit 2d1618170eb493d18f66f2ac03775409a6fb97c6 upstream.

priv->work must not be synced while priv->mutex is locked, because
the mutex is taken in the work handler.
Move cancel_work_sync down to after the device shutdown code.
This is safe, because the work handler checks fw_state and bails out
early in case of a race.

Signed-off-by: Michael Buesch
Acked-by: Christian Lamparter
Signed-off-by: John W. Linville
Signed-off-by: Paul Gortmaker
---
 drivers/net/wireless/p54/p54spi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/p54/p54spi.c b/drivers/net/wireless/p54/p54spi.c
index 4cede24..8cf0301 100644
--- a/drivers/net/wireless/p54/p54spi.c
+++ b/drivers/net/wireless/p54/p54spi.c
@@ -582,8 +582,6 @@ static void p54spi_op_stop(struct ieee80211_hw *dev)
 
 	WARN_ON(priv->fw_state != FW_STATE_READY);
 
-	cancel_work_sync(&priv->work);
-
 	p54spi_power_off(priv);
 	spin_lock_irqsave(&priv->tx_lock, flags);
 	INIT_LIST_HEAD(&priv->tx_pending);
@@ -591,6 +589,8 @@ static void p54spi_op_stop(struct ieee80211_hw *dev)
 	priv->fw_state = FW_STATE_OFF;
 	mutex_unlock(&priv->mutex);
+
+	cancel_work_sync(&priv->work);
 }
 
 static int __devinit p54spi_probe(struct spi_device *spi)
--
1.7.12.1
Re: workqueue deadlock
On Monday, 11 December 2006 07:52, Dipankar Sarma wrote:
> On Sun, Dec 10, 2006 at 03:18:38PM +0100, Rafael J. Wysocki wrote:
> > On Sunday, 10 December 2006 13:16, Andrew Morton wrote:
> > > On Sun, 10 Dec 2006 12:49:43 +0100
> >
> > Hm, currently we're using the CPU hotplug to disable the nonboot CPUs
> > before the freezer is called. ;-)
> >
> > However, we're now trying to make the freezer SMP-safe, so that we can
> > disable the nonboot CPUs after devices have been suspended (we want to
> > do this for some ACPI-related reasons). In fact we're almost there, I'm
> > only waiting for the confirmation from Pavel to post the patches.
> >
> > When this is done, we won't need the CPU hotplug that much and I think
> > the CPU hotplug code will be able to do something like:
> >
> > freeze_processes
> > suspend_cpufreq (or even suspend_all_devices)
> > remove_the_CPU_in_question
> > resume_cpufreq
> > thaw_processes
>
> Have you thought about how much time this might take on
> a machine with, say, 128 CPUs of which I want to dynamically
> reconfigure 64 of them and make a new partition?  Assume
> 10,000 tasks are running in the system.

Well, of course it doesn't make sense to freeze processes and thaw them
for each CPU again and again. Still, the overall idea is to freeze
processes and suspend devices once, then do something with as many CPUs
as necessary etc.

Greetings,
Rafael

--
If you don't have the time to read, you don't have the time or the tools
to write.
		- Stephen King
Re: workqueue deadlock
On Sun, Dec 10, 2006 at 03:18:38PM +0100, Rafael J. Wysocki wrote:
> On Sunday, 10 December 2006 13:16, Andrew Morton wrote:
> > On Sun, 10 Dec 2006 12:49:43 +0100
>
> Hm, currently we're using the CPU hotplug to disable the nonboot CPUs
> before the freezer is called. ;-)
>
> However, we're now trying to make the freezer SMP-safe, so that we can
> disable the nonboot CPUs after devices have been suspended (we want to
> do this for some ACPI-related reasons). In fact we're almost there, I'm
> only waiting for the confirmation from Pavel to post the patches.
>
> When this is done, we won't need the CPU hotplug that much and I think
> the CPU hotplug code will be able to do something like:
>
> freeze_processes
> suspend_cpufreq (or even suspend_all_devices)
> remove_the_CPU_in_question
> resume_cpufreq
> thaw_processes

Have you thought about how much time this might take on a machine with,
say, 128 CPUs of which I want to dynamically reconfigure 64 of them and
make a new partition? Assume 10,000 tasks are running in the system.

Thanks
Dipankar
Re: workqueue deadlock
* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > > > {
> > > > 	int cpu = raw_smp_processor_id();
> > > > 	/*
> > > > 	 * Interrupts/softirqs are hotplug-safe:
> > > > 	 */
> > > > 	if (in_interrupt())
> > > > 		return;
> > > > 	if (current->hotplug_depth++)
> > > > 		return;
> > > >
> > > > 	current->hotplug_lock = &per_cpu(hotplug_lock, cpu);
> > > > 	mutex_lock(current->hotplug_lock);
> > >
> > > And it sleeps, so we can't use preempt_disable().
> >
> > i explained it in the other mail - this is the 'read' side. The
> > 'write' side (code actually wanting to /do/ a CPU hotplug state
> > transition) has to take /all/ these locks before it can take a CPU
> > down.
>
> Doesn't matter - the race is still there.
>
> Well, not really, because we don't free the percpu data of offlined
> CPUs, but we'd like to.
>
> And it's easily fixable by using a statically-allocated array. That
> would make life easier for code which wants to take this lock early in
> boot too.

yeah. but i think your freeze_processes() idea is even better - it would
unify the suspend and the CPU-hotplug infrastructure even more. (and
kprobes wants to use freeze_processes() too)

	Ingo
Re: workqueue deadlock
> On Mon, 11 Dec 2006 11:15:45 +0530 Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:
> On Sun, Dec 10, 2006 at 04:16:00AM -0800, Andrew Morton wrote:
> > One quite different way of addressing all of this is to stop using
> > stop_machine_run() for hotplug synchronisation and switch to the swsusp
> > freezer infrastructure: all kernel threads and user processes need to
> > stop and park themselves in a known state before we allow the CPU to
> > be removed. lock_cpu_hotplug() becomes a no-op.
>
> Well ...you still need to provide some mechanism for stable access to
> cpu_online_map in blocking functions (ex: do_event_scan_all_cpus).
> Freezing tasks and resuming them after the hot-unplug is definitely not
> one of them (when they resume, online_map would have changed under
> their feet).

Problems will only occur if a process is holding some sort of per-cpu
state across a call to try_to_freeze(). Surely nobody does that.
Re: workqueue deadlock
On Sun, Dec 10, 2006 at 04:16:00AM -0800, Andrew Morton wrote:
> One quite different way of addressing all of this is to stop using
> stop_machine_run() for hotplug synchronisation and switch to the swsusp
> freezer infrastructure: all kernel threads and user processes need to
> stop and park themselves in a known state before we allow the CPU to
> be removed. lock_cpu_hotplug() becomes a no-op.

Well ...you still need to provide some mechanism for stable access to
cpu_online_map in blocking functions (ex: do_event_scan_all_cpus).
Freezing tasks and resuming them after the hot-unplug is definitely not
one of them (when they resume, online_map would have changed under
their feet).

> Dunno if it'll work - I only just thought of it. It sure would simplify
> things.

--
Regards,
vatsa
Re: workqueue deadlock
On Sun, Dec 10, 2006 at 09:26:16AM +0100, Ingo Molnar wrote:
> something like the pseudocode further below - when applied to a data
> structure it has semantics and scalability close to that of
> preempt_disable(), but it is still preemptible and the lock is specific.

Ingo,
	The pseudo-code you have provided can still fail to avoid the
deadlock reported by Bjorn Helgaas earlier in this thread:

	http://lkml.org/lkml/2006/12/6/352

Thread1->flush_workqueue->mutex_lock(cpu4's hotplug_lock)
Thread2(keventd)->run_workqueue->some_work_fn->
	...flush_workqueue->mutex_lock(cpu4's hotplug_lock)

Both deadlock with each other. All this mess could easily be avoided if
we implement a reference-count based cpu_hotplug_lock(), as suggested by
Arjan and Linus before and implemented by Gautham here:

	http://lkml.org/lkml/2006/10/26/65

--
Regards,
vatsa
Re: workqueue deadlock
On Sat, 9 Dec 2006 11:26:52 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> * Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > > > +	if (cpu != -1)
> > > > +		mutex_lock(&workqueue_mutex);
> > >
> > > events/4 thread itself wanting the same mutex above?
> >
> > Could do, not sure. I'm planning on converting all the locking around
> > here to preempt_disable() though.
>
> please at least use an owner-recursive per-CPU lock,

a wot?

> not a naked preempt_disable()! The concurrency rules for data
> structures changed via preempt_disable() are quite hard to sort out
> after the fact.

preempt_disable() is the preferred way of holding off cpu hotplug.

> (preempt_disable() is too opaque, it doesn't attach data structure to
> critical section, like normal locks do.)

the data structure is the CPU, and its per-cpu data. And cpu_online_map.
Re: workqueue deadlock
* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > > +	if (cpu != -1)
> > > +		mutex_lock(&workqueue_mutex);
> >
> > events/4 thread itself wanting the same mutex above?
>
> Could do, not sure. I'm planning on converting all the locking around
> here to preempt_disable() though.

please at least use an owner-recursive per-CPU lock, not a naked
preempt_disable()! The concurrency rules for data structures changed via
preempt_disable() are quite hard to sort out after the fact.
(preempt_disable() is too opaque, it doesn't attach data structure to
critical section, like normal locks do.)

	Ingo
Re: workqueue deadlock
On Thu, Dec 07, 2006 at 08:54:07PM -0800, Andrew Morton wrote:
> Could do, not sure.

AFAICS it will deadlock for sure.

> I'm planning on converting all the locking around here to
> preempt_disable() though.

Will look forward to that patch. It's hard to dance around w/o a
lock_cpu_hotplug() ..:)

--
Regards,
vatsa
Re: workqueue deadlock
On Fri, 8 Dec 2006 08:23:01 +0530 Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:

> On Thu, Dec 07, 2006 at 11:37:00AM -0800, Andrew Morton wrote:
> > -static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
> > +/*
> > + * If cpu == -1 it's a single-threaded workqueue and the caller does
> > + * not hold workqueue_mutex
> > + */
> > +static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq, int cpu)
>
> Let's say @cpu = 4
>
> > {
> > 	if (cwq->thread == current) {
> > 		/*
> > 		 * Probably keventd trying to flush its own queue. So simply run
> > 		 * it by hand rather than deadlocking.
> > 		 */
> > +		if (cpu != -1)
> > +			mutex_unlock(&workqueue_mutex);
>
> Let's say we release the workqueue mutex here (events/4 is trying to
> flush its own workqueue). Immediately another CPU takes this mutex
> (in CPU_DOWN_PREPARE) and brings down CPU4. In CPU_DEAD handling we now
> wait on the events/4 thread to exit (cleanup_workqueue_thread).
>
> Couldn't this wait deadlock on:
>
> > 		run_workqueue(cwq);
> > +		if (cpu != -1)
> > +			mutex_lock(&workqueue_mutex);
>
> the events/4 thread itself wanting the same mutex above?

Could do, not sure. I'm planning on converting all the locking around
here to preempt_disable() though.
Re: workqueue deadlock
On Thu, Dec 07, 2006 at 11:37:00AM -0800, Andrew Morton wrote:
> -static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
> +/*
> + * If cpu == -1 it's a single-threaded workqueue and the caller does not
> + * hold workqueue_mutex
> + */
> +static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq, int cpu)

Let's say @cpu = 4

> {
> 	if (cwq->thread == current) {
> 		/*
> 		 * Probably keventd trying to flush its own queue. So simply run
> 		 * it by hand rather than deadlocking.
> 		 */
> +		if (cpu != -1)
> +			mutex_unlock(&workqueue_mutex);

Let's say we release the workqueue mutex here (events/4 is trying to
flush its own workqueue). Immediately another CPU takes this mutex
(in CPU_DOWN_PREPARE) and brings down CPU4. In CPU_DEAD handling we now
wait on the events/4 thread to exit (cleanup_workqueue_thread).

Couldn't this wait deadlock on:

> 		run_workqueue(cwq);
> +		if (cpu != -1)
> +			mutex_lock(&workqueue_mutex);

the events/4 thread itself wanting the same mutex above?

What am I missing?

--
Regards,
vatsa
Re: workqueue deadlock
On Thu, 7 Dec 2006 10:51:48 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> +		if (!cpu_online(cpu))	/* oops, CPU got unplugged */
> +			goto bail;

hm, actually we can pull the same trick with flush_scheduled_work().

Should fix quite a few things...


From: Andrew Morton <[EMAIL PROTECTED]>

We have a class of deadlocks where the flush_scheduled_work() caller can
get stuck waiting for a work to complete, where that work wants to take
workqueue_mutex for some reason.

Fix this by not holding workqueue_mutex when waiting for a workqueue to
flush.

The patch assumes that the per-cpu workqueue won't get freed up while
there's a task waiting on cpu_workqueue_struct.work_done.  If that can
happen, run_workqueue() would crash anyway.

Cc: Bjorn Helgaas <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Srivatsa Vaddagiri <[EMAIL PROTECTED]>
Cc: Gautham shenoy <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/workqueue.c |   22 +++++++++++++++++++---
 1 files changed, 19 insertions(+), 3 deletions(-)

diff -puN kernel/workqueue.c~workqueue-dont-hold-workqueue_mutex-in-flush_scheduled_work kernel/workqueue.c
--- a/kernel/workqueue.c~workqueue-dont-hold-workqueue_mutex-in-flush_scheduled_work
+++ a/kernel/workqueue.c
@@ -325,14 +325,22 @@ static int worker_thread(void *__cwq)
 	return 0;
 }
 
-static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
+/*
+ * If cpu == -1 it's a single-threaded workqueue and the caller does not hold
+ * workqueue_mutex
+ */
+static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq, int cpu)
 {
 	if (cwq->thread == current) {
 		/*
 		 * Probably keventd trying to flush its own queue. So simply run
 		 * it by hand rather than deadlocking.
 		 */
+		if (cpu != -1)
+			mutex_unlock(&workqueue_mutex);
 		run_workqueue(cwq);
+		if (cpu != -1)
+			mutex_lock(&workqueue_mutex);
 	} else {
 		DEFINE_WAIT(wait);
 		long sequence_needed;
@@ -344,7 +352,14 @@ static void flush_cpu_workqueue(struct c
 			prepare_to_wait(&cwq->work_done, &wait,
 					TASK_UNINTERRUPTIBLE);
 			spin_unlock_irq(&cwq->lock);
+			if (cpu != -1)
+				mutex_unlock(&workqueue_mutex);
 			schedule();
+			if (cpu != -1) {
+				mutex_lock(&workqueue_mutex);
+				if (!cpu_online(cpu))
+					return;	/* oops, CPU unplugged */
+			}
 			spin_lock_irq(&cwq->lock);
 		}
 		finish_wait(&cwq->work_done, &wait);
@@ -373,13 +388,14 @@ void fastcall flush_workqueue(struct wor
 
 	if (is_single_threaded(wq)) {
 		/* Always use first cpu's area. */
-		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, singlethread_cpu));
+		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, singlethread_cpu),
+				    -1);
 	} else {
 		int cpu;
 
 		mutex_lock(&workqueue_mutex);
 		for_each_online_cpu(cpu)
-			flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
+			flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu), cpu);
 		mutex_unlock(&workqueue_mutex);
 	}
 }
_
Re: workqueue deadlock
On Wed, 6 Dec 2006 17:26:14 -0700
Bjorn Helgaas <[EMAIL PROTECTED]> wrote:

> I'm seeing a workqueue-related deadlock.  This is on an ia64
> box running SLES10, but it looks like the same problem should
> be possible in current upstream on any architecture.
>
> Here are the two tasks involved:
>
> events/4:
>   schedule
>   __down
>   __lock_cpu_hotplug
>   lock_cpu_hotplug
>   flush_workqueue
>   kblockd_flush
>   blk_sync_queue
>   cfq_shutdown_timer_wq
>   cfq_exit_queue
>   elevator_exit
>   blk_cleanup_queue
>   scsi_free_queue
>   scsi_device_dev_release_usercontext
>   run_workqueue
>
> loadkeys:
>   schedule
>   flush_cpu_workqueue
>   flush_workqueue
>   flush_scheduled_work
>   release_dev
>   tty_release

This will go away if/when I get the proposed new flush_work(struct
work_struct *) implemented.  We can then convert blk_sync_queue() to do

	flush_work(&q->unplug_work);

which will only block if blk_unplug_work() is actually executing on this
queue, and which will return as soon as blk_unplug_work() has finished.
(And a similar change in release_dev()).

It doesn't solve the fundamental problem though.  But I'm not sure what
that is.  If it is "flush_scheduled_work() waits on things which the
caller isn't interested in" then it will fix the fundamental problem.
Needs more work:

diff -puN kernel/workqueue.c~implement-flush_work kernel/workqueue.c
--- a/kernel/workqueue.c~implement-flush_work
+++ a/kernel/workqueue.c
@@ -53,6 +53,7 @@ struct cpu_workqueue_struct {
 
 	struct workqueue_struct *wq;
 	struct task_struct *thread;
+	struct work_struct *current_work;
 
 	int run_depth;		/* Detect run_workqueue() recursion depth */
 } ____cacheline_aligned;
@@ -243,6 +244,7 @@ static void run_workqueue(struct cpu_wor
 		work_func_t f = work->func;
 
 		list_del_init(cwq->worklist.next);
+		cwq->current_work = work;
 		spin_unlock_irqrestore(&cwq->lock, flags);
 
 		BUG_ON(get_wq_data(work) != cwq);
@@ -251,6 +253,7 @@ static void run_workqueue(struct cpu_wor
 		f(work);
 
 		spin_lock_irqsave(&cwq->lock, flags);
+		cwq->current_work = NULL;
 		cwq->remove_sequence++;
 		wake_up(&cwq->work_done);
 	}
@@ -330,6 +333,70 @@ static void flush_cpu_workqueue(struct c
 	}
 }
 
+static void wait_on_work(struct cpu_workqueue_struct *cwq,
+			 struct work_struct *work, int cpu)
+{
+	DEFINE_WAIT(wait);
+
+	spin_lock_irq(&cwq->lock);
+	while (cwq->current_work == work) {
+		prepare_to_wait(&cwq->work_done, &wait, TASK_UNINTERRUPTIBLE);
+		spin_unlock_irq(&cwq->lock);
+		mutex_unlock(&workqueue_mutex);
+		schedule();
+		mutex_lock(&workqueue_mutex);
+		if (!cpu_online(cpu))	/* oops, CPU got unplugged */
+			goto bail;
+		spin_lock_irq(&cwq->lock);
+	}
+	spin_unlock_irq(&cwq->lock);
+bail:
+	finish_wait(&cwq->work_done, &wait);
+}
+
+static void flush_one_work(struct cpu_workqueue_struct *cwq,
+			   struct work_struct *work, int cpu)
+{
+	spin_lock_irq(&cwq->lock);
+	if (test_and_clear_bit(WORK_STRUCT_PENDING, &work->management)) {
+		list_del_init(&work->entry);
+		spin_unlock_irq(&cwq->lock);
+		return;
+	}
+	spin_unlock_irq(&cwq->lock);
+
+	/* It's running, or it has completed */
+
+	if (cwq->thread == current) {
+		/* This stinks */
+		/*
+		 * Probably keventd trying to flush its own queue. So simply run
+		 * it by hand rather than deadlocking.
+		 */
+		run_workqueue(cwq);
+	} else {
+		wait_on_work(cwq, work, cpu);
+	}
+}
+
+void flush_work(struct workqueue_struct *wq, struct work_struct *work)
+{
+	might_sleep();
+
+	mutex_lock(&workqueue_mutex);
+	if (is_single_threaded(wq)) {
+		/* Always use first cpu's area. */
+		flush_one_work(per_cpu_ptr(wq->cpu_wq, singlethread_cpu), work,
+				singlethread_cpu);
+	} else {
+		int cpu;
+
+		for_each_online_cpu(cpu)
+			flush_one_work(per_cpu_ptr(wq->cpu_wq, cpu), work, cpu);
+	}
+	mutex_unlock(&workqueue_mutex);
+}
+
 /**
  * flush_workqueue - ensure that any scheduled work has run to completion.
  * @wq: workqueue to flush
_
Re: workqueue deadlock
On Thu, Dec 07, 2006 at 11:47:01AM +0530, Srivatsa Vaddagiri wrote:
> 	- Make it a rw-sem

I think rw-sems were also shown to hit deadlocks (a recursive read-lock
attempt deadlocks when a writer comes in between the two read attempts by
the same thread). So only the suggestion below seems to make sense ..

> 	- Make it a per-cpu mutex, which could be either:
>
> 	  http://lkml.org/lkml/2006/11/30/110 - Ingo's suggestion
> 	  http://lkml.org/lkml/2006/10/26/65 - Gautham's work based on RCU
>
> In Ingo's suggestion, I really don't know if the task_struct
> modifications are a good thing (to support recursive requirements).
> Gautham's patches avoid modifications to task_struct, are fast, but can
> starve writers (who want to bring down/up a CPU).

-- 
Regards,
vatsa
Re: workqueue deadlock
On Wed, Dec 06, 2006 at 05:26:14PM -0700, Bjorn Helgaas wrote:
> loadkeys is holding the cpu_hotplug lock (acquired in flush_workqueue())
> and waiting in flush_cpu_workqueue() until the cpu_workqueue drains.
>
> But events/4 is responsible for draining it, and it is blocked waiting
> to acquire the cpu_hotplug lock.
>
> In current upstream, the cpu_hotplug lock has been replaced with
> workqueue_mutex, but it looks to me like the same deadlock is still
> possible.

Yes, I think so too.

> Is there some rule that workqueue functions shouldn't try to
> flush a workqueue?

In general, workqueue functions wanting to flush a workqueue seems weird
to me. But in this case, I think the requirement is to block until all
queued work is complete, which is what flush_workqueue is supposed to do.
Hence I don't see any way to avoid it ..

> Or should flush_workqueue() be smarter by
> releasing the workqueue_mutex once in a while?

IMHO, overhauling lock_cpu_hotplug() to support scenarios like this is a
better approach:

	- Make it a rw-sem
	- Make it a per-cpu mutex, which could be either:

	  http://lkml.org/lkml/2006/11/30/110 - Ingo's suggestion
	  http://lkml.org/lkml/2006/10/26/65 - Gautham's work based on RCU

In Ingo's suggestion, I really don't know if the task_struct
modifications are a good thing (to support recursive requirements).
Gautham's patches avoid modifications to task_struct, are fast, but can
starve writers (who want to bring down/up a CPU).

-- 
Regards,
vatsa
workqueue deadlock
I'm seeing a workqueue-related deadlock.  This is on an ia64
box running SLES10, but it looks like the same problem should
be possible in current upstream on any architecture.

Here are the two tasks involved:

events/4:
  schedule
  __down
  __lock_cpu_hotplug
  lock_cpu_hotplug
  flush_workqueue
  kblockd_flush
  blk_sync_queue
  cfq_shutdown_timer_wq
  cfq_exit_queue
  elevator_exit
  blk_cleanup_queue
  scsi_free_queue
  scsi_device_dev_release_usercontext
  run_workqueue

loadkeys:
  schedule
  flush_cpu_workqueue
  flush_workqueue
  flush_scheduled_work
  release_dev
  tty_release

loadkeys is holding the cpu_hotplug lock (acquired in flush_workqueue())
and waiting in flush_cpu_workqueue() until the cpu_workqueue drains.

But events/4 is responsible for draining it, and it is blocked waiting
to acquire the cpu_hotplug lock.

In current upstream, the cpu_hotplug lock has been replaced with
workqueue_mutex, but it looks to me like the same deadlock is still
possible.

Is there some rule that workqueue functions shouldn't try to flush a
workqueue?  Or should flush_workqueue() be smarter by releasing the
workqueue_mutex once in a while?