Re: [PATCH] cpuidle: Measure idle state durations with monotonic clock
On 11/15/2012 04:04 AM, Preeti Murthy wrote: > Hi all, > > The code looks correct and inviting to me as it has led to good cleanups. > I dont think passing 0 as the argument to the function > sched_clock_idle_wakeup_event() > should lead to problems,as it does not do anything useful with the > passed arguments. > > My only curiosity is what was the purpose of passing idle residency time to > sched_clock_idle_wakeup_event() when this data could always be retrieved from > dev->last_residency for each cpu,which gets almost immediately updated. sched_clock_idle_wakeup_event() is part of the scheduler. The scheduler doesn't know what a cpuidle_device is, and probably should not grow such a dependency. cheers, -Len Brown, Intel Open Source Technology Center > But this does not seem to come in way of this patch for now.Anyway I > have added Peter to > the list so that he can opine about this issue if possible and needed. > > Reviewed-by: Preeti U Murthy > > > Regards > Preeti U Murthy > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpuidle: Measure idle state durations with monotonic clock
>> drivers/idle/intel_idle.c | 14 +- Acked-by: Len Brown thanks! -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/4] intel_idle: stop using driver_data for static flags
>> I think that a driver's private flag definitions >> should remain local to the driver. It makes no sense >> to pollute the name space of other drivers for stuff >> that doesn't mean anything to them. MWAIT is pretty >> specific to x86 -- and re-naming it to something 'generic' >> isn't going to make the code easier to read. > > Ok, let me rephrase it because I think how it was presented was not clear. > > As we want to use the half of the state flags for private purpose, I > suggested to add the encoding/decoding function in the shared header file. > > The mwait eax flags are not encoded and the CPUIDLE_FLAG_TLB_FLUSHED is > encoded. > > I suggested to unify both and to use an encoding function from the > shared header file. > > #define CPUIDLE_PRIVATE_FLAGS(_flags_) \ > ((_flags_) << 16) & CPUIDLE_DRIVER_FLAGS_MASK > > For example: > > #define FLAG_TLB_FLUSHED CPUIDLE_PRIVATE_FLAGS(0x1) > #define FLAG_MWAIT_C1CPUIDLE_PRIVATE_FLAGS(0x0) > #define FLAG_MWAIT_C2CPUIDLE_PRIVATE_FLAGS(0x10) > #define FLAG_MWAIT_C3CPUIDLE_PRIVATE_FLAGS(0x20) > #define FLAG_MWAIT_C4CPUIDLE_PRIVATE_FLAGS(0x30) > #define FLAG_MWAIT_C5CPUIDLE_PRIVATE_FLAGS(0x40) > #define FLAG_MWAIT_C6CPUIDLE_PRIVATE_FLAGS(0x52) > > And then: > > ... > > .flags = FLAG_TLB_FLUSHED | FLAG_MWAIT_C2 | CPUIDLE_FLAG_TIME_VALID I like your syntax better than what I used -- thanks for suggesting it. I can't use it as-is, however, because I really do need to be able to support unique parameters on each entry (eg. different flags for C3 on different processors). When some of the logical and functional changes in flux settle out I'll come back to re-visit the syntax. thanks, -Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 01/16] intel_idle: stop using driver_data for static flags
From: Len Brown The commit, 4202735e8ab6ecfb0381631a0d0b58fefe0bd4e2 (cpuidle: Split cpuidle_state structure and move per-cpu statistics fields) observed that the MWAIT flags for Cn on every processor to date were the same, and created get_driver_data() to supply them. Unfortunately, that assumption is false, going forward. So here we restore the MWAIT flags to the cpuidle_state table. However, instead restoring the old "driver_data" field, we put the flags into the existing "flags" field, where they probalby should have lived all along. This patch does not change any operation. This patch removes 1 of the 3 users of cpuidle_state_usage.driver_data. Perhaps some day we'll get rid of the other 2. Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 75 --- 1 file changed, 26 insertions(+), 49 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 2df9414..b2cf489 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -109,6 +109,16 @@ static struct cpuidle_state *cpuidle_state_table; #define CPUIDLE_FLAG_TLB_FLUSHED 0x1 /* + * MWAIT takes an 8-bit "hint" in EAX "suggesting" + * the C-state (top nibble) and sub-state (bottom nibble) + * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc. + * + * We store the hint at the top of our "flags" for each state. + */ +#define flg2MWAIT(flags) (((flags) >> 24) & 0xFF) +#define MWAIT2flg(eax) ((eax & 0xFF) << 24) + +/* * States are indexed by the cstate number, * which is also the index into the MWAIT hint array. * Thus C0 is a dummy. @@ -118,21 +128,21 @@ static struct cpuidle_state nehalem_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C1 */ .name = "C1-NHM", .desc = "MWAIT 0x00", - .flags = CPUIDLE_FLAG_TIME_VALID, + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 3, .target_residency = 6, .enter = &intel_idle }, { /* MWAIT C2 */ .name = "C3-NHM", .desc = "MWAIT 0x10", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 20, .target_residency = 80, .enter = &intel_idle }, { /* MWAIT C3 */ .name = "C6-NHM", .desc = "MWAIT 0x20", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 200, .target_residency = 800, .enter = &intel_idle }, @@ -143,28 +153,28 @@ static struct cpuidle_state snb_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C1 */ .name = "C1-SNB", .desc = "MWAIT 0x00", - .flags = CPUIDLE_FLAG_TIME_VALID, + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 1, .target_residency = 1, .enter = &intel_idle }, { /* MWAIT C2 */ .name = "C3-SNB", .desc = "MWAIT 0x10", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 80, .target_residency = 211, .enter = &intel_idle }, { /* MWAIT C3 */ .name = "C6-SNB", .desc = "MWAIT 0x20", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 104, .target_residency = 345, .enter = &intel_idle }, { /* MWAIT C4 */ .name = "C7-SNB", .desc = "MWAIT 0x30", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 109, .target_residency = 345, .enter = &intel_idle }, @@ -175,28 +185,28 @@ static struct cpuidle_state ivb_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C1 */ .name = "C1-IVB", .desc = "MWAIT 0x00", - .flags = CPUIDLE_FLAG_TIME_VALID, + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID,
[PATCH 10/16] cpuidle: remove vestage definition of cpuidle_state_usage.driver_data
From: Len Brown This field is no longer used. Signed-off-by: Len Brown --- include/linux/cpuidle.h | 22 -- 1 file changed, 22 deletions(-) diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 24cd1037..480c14d 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -32,8 +32,6 @@ struct cpuidle_driver; / struct cpuidle_state_usage { - void*driver_data; - unsigned long long disable; unsigned long long usage; unsigned long long time; /* in US */ @@ -62,26 +60,6 @@ struct cpuidle_state { #define CPUIDLE_DRIVER_FLAGS_MASK (0x) -/** - * cpuidle_get_statedata - retrieves private driver state data - * @st_usage: the state usage statistics - */ -static inline void *cpuidle_get_statedata(struct cpuidle_state_usage *st_usage) -{ - return st_usage->driver_data; -} - -/** - * cpuidle_set_statedata - stores private driver state data - * @st_usage: the state usage statistics - * @data: the private data - */ -static inline void -cpuidle_set_statedata(struct cpuidle_state_usage *st_usage, void *data) -{ - st_usage->driver_data = data; -} - struct cpuidle_device { unsigned intregistered:1; unsigned intenabled:1; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 14/16] intel_idle: remove use and definition of MWAIT_MAX_NUM_CSTATES
From: Len Brown Cosmetic only. Replace use of MWAIT_MAX_NUM_CSTATES with CPUIDLE_STATE_MAX. They are both 8, so this patch has no functional change. The reason to change is that intel_idle will soon be able to export more than the 8 "major" states supported by MWAIT. When we hit that limit, it is important to know where the limit comes from. Signed-off-by: Len Brown --- arch/x86/include/asm/mwait.h | 1 - drivers/idle/intel_idle.c| 16 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h index bcdff99..3f44732 100644 --- a/arch/x86/include/asm/mwait.h +++ b/arch/x86/include/asm/mwait.h @@ -4,7 +4,6 @@ #define MWAIT_SUBSTATE_MASK0xf #define MWAIT_CSTATE_MASK 0xf #define MWAIT_SUBSTATE_SIZE4 -#define MWAIT_MAX_NUM_CSTATES 8 #define CPUID_MWAIT_LEAF 5 #define CPUID5_ECX_EXTENSIONS_SUPPORTED 0x1 diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index fa71477..c949a6f 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -74,7 +74,7 @@ static struct cpuidle_driver intel_idle_driver = { .en_core_tk_irqen = 1, }; /* intel_idle.max_cstate=0 disables driver */ -static int max_cstate = MWAIT_MAX_NUM_CSTATES - 1; +static int max_cstate = CPUIDLE_STATE_MAX - 1; static unsigned int mwait_substates; @@ -123,7 +123,7 @@ static struct cpuidle_state *cpuidle_state_table; * which is also the index into the MWAIT hint array. * Thus C0 is a dummy. */ -static struct cpuidle_state nehalem_cstates[MWAIT_MAX_NUM_CSTATES] = { +static struct cpuidle_state nehalem_cstates[CPUIDLE_STATE_MAX] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ .name = "C1-NHM", @@ -148,7 +148,7 @@ static struct cpuidle_state nehalem_cstates[MWAIT_MAX_NUM_CSTATES] = { .enter = &intel_idle }, }; -static struct cpuidle_state snb_cstates[MWAIT_MAX_NUM_CSTATES] = { +static struct cpuidle_state snb_cstates[CPUIDLE_STATE_MAX] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ .name = "C1-SNB", @@ -180,7 +180,7 @@ static struct cpuidle_state snb_cstates[MWAIT_MAX_NUM_CSTATES] = { .enter = &intel_idle }, }; -static struct cpuidle_state ivb_cstates[MWAIT_MAX_NUM_CSTATES] = { +static struct cpuidle_state ivb_cstates[CPUIDLE_STATE_MAX] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ .name = "C1-IVB", @@ -212,7 +212,7 @@ static struct cpuidle_state ivb_cstates[MWAIT_MAX_NUM_CSTATES] = { .enter = &intel_idle }, }; -static struct cpuidle_state hsw_cstates[MWAIT_MAX_NUM_CSTATES] = { +static struct cpuidle_state hsw_cstates[CPUIDLE_STATE_MAX] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ .name = "C1-HSW", @@ -244,7 +244,7 @@ static struct cpuidle_state hsw_cstates[MWAIT_MAX_NUM_CSTATES] = { .enter = &intel_idle }, }; -static struct cpuidle_state atom_cstates[MWAIT_MAX_NUM_CSTATES] = { +static struct cpuidle_state atom_cstates[CPUIDLE_STATE_MAX] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ .name = "C1-ATM", @@ -503,7 +503,7 @@ static int intel_idle_cpuidle_driver_init(void) drv->state_count = 1; - for (cstate = 1; cstate < MWAIT_MAX_NUM_CSTATES; ++cstate) { + for (cstate = 1; cstate < CPUIDLE_STATE_MAX; ++cstate) { int num_substates; if (cstate > max_cstate) { @@ -560,7 +560,7 @@ static int intel_idle_cpu_init(int cpu) dev->state_count = 1; - for (cstate = 1; cstate < MWAIT_MAX_NUM_CSTATES; ++cstate) { + for (cstate = 1; cstate < CPUIDLE_STATE_MAX; ++cstate) { int num_substates; if (cstate > max_cstate) { -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 16/16] intel_idle: export both C1 and C1E
From: Len Brown Here we disable HW promotion of C1 to C1E and export both C1 and C1E and distinct C-states. This allows a cpuidle governor to choose a lower latency C-state than C1E when necessary to satisfy performance and QOS constraints -- and still save power versus polling. This also corrects the erroneous latency previously reported for C1E -- it is 10usec, not 1usec. Note that if you use "intel_idle.max_cstate=N", then you must increment N by 1 to get the same behavior after this change. Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 55 ++- 1 file changed, 50 insertions(+), 5 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 927cfb4..fe2d397 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -90,6 +90,7 @@ struct idle_cpu { * Indicate which enable bits to clear here. */ unsigned long auto_demotion_disable_flags; + bool disable_promotion_to_c1e; }; static const struct idle_cpu *icpu; @@ -132,6 +133,13 @@ static struct cpuidle_state nehalem_cstates[CPUIDLE_STATE_MAX] = { .target_residency = 6, .enter = &intel_idle }, { + .name = "C1E-NHM", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 20, + .enter = &intel_idle }, + { .name = "C3-NHM", .desc = "MWAIT 0x10", .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, @@ -154,8 +162,15 @@ static struct cpuidle_state snb_cstates[CPUIDLE_STATE_MAX] = { .name = "C1-SNB", .desc = "MWAIT 0x00", .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, - .exit_latency = 1, - .target_residency = 1, + .exit_latency = 2, + .target_residency = 2, + .enter = &intel_idle }, + { + .name = "C1E-SNB", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 20, .enter = &intel_idle }, { .name = "C3-SNB", @@ -191,6 +206,13 @@ static struct cpuidle_state ivb_cstates[CPUIDLE_STATE_MAX] = { .target_residency = 1, .enter = &intel_idle }, { + .name = "C1E-IVB", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 20, + .enter = &intel_idle }, + { .name = "C3-IVB", .desc = "MWAIT 0x10", .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, @@ -224,6 +246,13 @@ static struct cpuidle_state hsw_cstates[CPUIDLE_STATE_MAX] = { .target_residency = 2, .enter = &intel_idle }, { + .name = "C1E-HSW", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 20, + .enter = &intel_idle }, + { .name = "C3-HSW", .desc = "MWAIT 0x10", .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, @@ -250,11 +279,11 @@ static struct cpuidle_state hsw_cstates[CPUIDLE_STATE_MAX] = { static struct cpuidle_state atom_cstates[CPUIDLE_STATE_MAX] = { { - .name = "C1-ATM", + .name = "C1E-ATM", .desc = "MWAIT 0x00", .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, - .exit_latency = 1, - .target_residency = 4, + .exit_latency = 10, + .target_residency = 20, .enter = &intel_idle }, { .name = "C2-ATM", @@ -377,10 +406,19 @@ static void auto_demotion_disable(void *dummy) msr_bits &= ~(icpu->auto_demotion_disable_flags); wrmsrl(MSR_NHM_SNB_PKG_CST_CFG_CTL, msr_bits); } +static void c1e_promotion_disable(void *dummy) +{ + unsigned long long msr_bits; + + rdmsrl(MSR_IA32_POWER_CTL, msr_bits); + msr_bits &= ~0x2; + wrmsrl(MSR_IA32_POWER_CTL, msr_bits); +} static const struct idle_cpu idle_cpu_nehalem = { .state_table = nehalem_cstates, .auto_demotion_disable_flags = NHM_C1_AUTO
[PATCH 02/16] Replace the flag by a simple global boolean in the cpuidle.c. That will allow to cleanup the rest of the code right after, because the ops won't make sense.
From: Daniel Lezcano Signed-off-by: Daniel Lezcano Acked-by: Sekhar Nori Signed-off-by: Len Brown --- arch/arm/mach-davinci/cpuidle.c | 23 ++- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/arch/arm/mach-davinci/cpuidle.c b/arch/arm/mach-davinci/cpuidle.c index 9107691..5fbd470 100644 --- a/arch/arm/mach-davinci/cpuidle.c +++ b/arch/arm/mach-davinci/cpuidle.c @@ -26,8 +26,8 @@ #define DAVINCI_CPUIDLE_MAX_STATES 2 struct davinci_ops { - void (*enter) (u32 flags); - void (*exit) (u32 flags); + void (*enter) (void); + void (*exit) (void); u32 flags; }; @@ -40,20 +40,17 @@ static int davinci_enter_idle(struct cpuidle_device *dev, struct davinci_ops *ops = cpuidle_get_statedata(state_usage); if (ops && ops->enter) - ops->enter(ops->flags); + ops->enter(); index = cpuidle_wrap_enter(dev, drv, index, arm_cpuidle_simple_enter); if (ops && ops->exit) - ops->exit(ops->flags); + ops->exit(); return index; } -/* fields in davinci_ops.flags */ -#define DAVINCI_CPUIDLE_FLAGS_DDR2_PWDNBIT(0) - static struct cpuidle_driver davinci_idle_driver = { .name = "cpuidle-davinci", .owner = THIS_MODULE, @@ -72,6 +69,7 @@ static struct cpuidle_driver davinci_idle_driver = { static DEFINE_PER_CPU(struct cpuidle_device, davinci_cpuidle_device); static void __iomem *ddr2_reg_base; +static bool ddr2_pdown; static void davinci_save_ddr_power(int enter, bool pdown) { @@ -92,14 +90,14 @@ static void davinci_save_ddr_power(int enter, bool pdown) __raw_writel(val, ddr2_reg_base + DDR2_SDRCR_OFFSET); } -static void davinci_c2state_enter(u32 flags) +static void davinci_c2state_enter(void) { - davinci_save_ddr_power(1, !!(flags & DAVINCI_CPUIDLE_FLAGS_DDR2_PWDN)); + davinci_save_ddr_power(1, ddr2_pdown); } -static void davinci_c2state_exit(u32 flags) +static void davinci_c2state_exit(void) { - davinci_save_ddr_power(0, !!(flags & DAVINCI_CPUIDLE_FLAGS_DDR2_PWDN)); + davinci_save_ddr_power(0, ddr2_pdown); } static struct davinci_ops davinci_states[DAVINCI_CPUIDLE_MAX_STATES] = { @@ -124,8 +122,7 @@ static int __init davinci_cpuidle_probe(struct platform_device *pdev) ddr2_reg_base = pdata->ddr2_ctlr_base; - if (pdata->ddr2_pdown) - davinci_states[1].flags |= DAVINCI_CPUIDLE_FLAGS_DDR2_PWDN; + ddr2_pdown = pdata->ddr2_pdown; cpuidle_set_statedata(&device->states_usage[1], &davinci_states[1]); device->state_count = DAVINCI_CPUIDLE_MAX_STATES; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 12/16] tools/power turbostat: support Haswell
From: Len Brown This patch enables turbostat to run properly on the next-generation Intel(R) Microarchitecture, code named "Haswell" (HSW). HSW supports the BCLK and counters found in SNB. Signed-off-by: Len Brown --- tools/power/x86/turbostat/turbostat.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index ce6d460..b326878 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -1397,6 +1397,9 @@ int has_nehalem_turbo_ratio_limit(unsigned int family, unsigned int model) case 0x2D: /* SNB Xeon */ case 0x3A: /* IVB */ case 0x3E: /* IVB Xeon */ + case 0x3C: /* HSW */ + case 0x3F: /* HSW */ + case 0x45: /* HSW */ return 1; case 0x2E: /* Nehalem-EX Xeon - Beckton */ case 0x2F: /* Westmere-EX Xeon - Eagleton */ @@ -1488,6 +1491,9 @@ void rapl_probe(unsigned int family, unsigned int model) switch (model) { case 0x2A: case 0x3A: + case 0x3C: /* HSW */ + case 0x3F: /* HSW */ + case 0x45: /* HSW */ do_rapl = RAPL_PKG | RAPL_CORES | RAPL_GFX; break; case 0x2D: @@ -1724,6 +1730,9 @@ int is_snb(unsigned int family, unsigned int model) case 0x2D: case 0x3A: /* IVB */ case 0x3E: /* IVB Xeon */ + case 0x3C: /* HSW */ + case 0x3F: /* HSW */ + case 0x45: /* HSW */ return 1; } return 0; @@ -2248,7 +2257,7 @@ int main(int argc, char **argv) cmdline(argc, argv); if (verbose) - fprintf(stderr, "turbostat v3.0 November 23, 2012" + fprintf(stderr, "turbostat v3.1 January 8, 2013" " - Len Brown \n"); turbostat_init(); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 15/16] intel_idle: remove assumption of one C-state per MWAIT flag
From: Len Brown Remove the assumption that cstate_tables are indexed by MWAIT flag values. Each entry identifies itself via its own flags value. This change is needed to support multiple states that share the same MWAIT flags. Note that this can have an effect on what state is described by 'N' on cmdline intel_idle.max_cstate=N on some systems. intel_idle.max_cstate=0 still disables the driver intel_idle.max_cstate=1 still results in just C1(E) However, "place holders" in the sparse C-state name-space (eg. Atom) have been removed. Signed-off-by: Len Brown --- arch/x86/include/asm/mwait.h | 2 + drivers/idle/intel_idle.c| 110 +++ 2 files changed, 61 insertions(+), 51 deletions(-) diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h index 3f44732..2f366d0 100644 --- a/arch/x86/include/asm/mwait.h +++ b/arch/x86/include/asm/mwait.h @@ -4,6 +4,8 @@ #define MWAIT_SUBSTATE_MASK0xf #define MWAIT_CSTATE_MASK 0xf #define MWAIT_SUBSTATE_SIZE4 +#define MWAIT_HINT2CSTATE(hint)(((hint) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) +#define MWAIT_HINT2SUBSTATE(hint) ((hint) & MWAIT_CSTATE_MASK) #define CPUID_MWAIT_LEAF 5 #define CPUID5_ECX_EXTENSIONS_SUPPORTED 0x1 diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index c949a6f..927cfb4 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -124,158 +124,161 @@ static struct cpuidle_state *cpuidle_state_table; * Thus C0 is a dummy. */ static struct cpuidle_state nehalem_cstates[CPUIDLE_STATE_MAX] = { - { /* MWAIT C0 */ }, - { /* MWAIT C1 */ + { .name = "C1-NHM", .desc = "MWAIT 0x00", .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 3, .target_residency = 6, .enter = &intel_idle }, - { /* MWAIT C2 */ + { .name = "C3-NHM", .desc = "MWAIT 0x10", .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 20, .target_residency = 80, .enter = &intel_idle }, - { /* MWAIT C3 */ + { .name = "C6-NHM", .desc = "MWAIT 0x20", .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 200, .target_residency = 800, .enter = &intel_idle }, + { + .enter = NULL } }; static struct cpuidle_state snb_cstates[CPUIDLE_STATE_MAX] = { - { /* MWAIT C0 */ }, - { /* MWAIT C1 */ + { .name = "C1-SNB", .desc = "MWAIT 0x00", .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 1, .target_residency = 1, .enter = &intel_idle }, - { /* MWAIT C2 */ + { .name = "C3-SNB", .desc = "MWAIT 0x10", .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 80, .target_residency = 211, .enter = &intel_idle }, - { /* MWAIT C3 */ + { .name = "C6-SNB", .desc = "MWAIT 0x20", .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 104, .target_residency = 345, .enter = &intel_idle }, - { /* MWAIT C4 */ + { .name = "C7-SNB", .desc = "MWAIT 0x30", .flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 109, .target_residency = 345, .enter = &intel_idle }, + { + .enter = NULL } }; static struct cpuidle_state ivb_cstates[CPUIDLE_STATE_MAX] = { - { /* MWAIT C0 */ }, - { /* MWAIT C1 */ + { .name = "C1-IVB", .desc = "MWAIT 0x00", .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 1, .target_residency = 1, .enter = &intel_idle }, - { /* MWAIT C2 */ + { .name = "C3-IVB", .desc = "MWAIT 0x10", .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 59, .target_residency = 156, .enter = &i
[PATCH 13/16] tools/power turbostat: decode MSR_IA32_POWER_CTL
From: Len Brown When verbose is enabled, print the C1E-Enable bit in MSR_IA32_POWER_CTL. also delete some redundant tests on the verbose variable. Signed-off-by: Len Brown --- arch/x86/include/uapi/asm/msr-index.h | 2 ++ tools/power/x86/turbostat/turbostat.c | 13 +++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h index 433a59f..7bdaf7c 100644 --- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -103,6 +103,8 @@ #define DEBUGCTLMSR_BTS_OFF_USR(1UL << 10) #define DEBUGCTLMSR_FREEZE_LBRS_ON_PMI (1UL << 11) +#define MSR_IA32_POWER_CTL 0x01fc + #define MSR_IA32_MC0_CTL 0x0400 #define MSR_IA32_MC0_STATUS0x0401 #define MSR_IA32_MC0_ADDR 0x0402 diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index b326878..75f64e0 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -908,8 +908,7 @@ void print_verbose_header(void) get_msr(0, MSR_NHM_PLATFORM_INFO, &msr); - if (verbose) - fprintf(stderr, "cpu0: MSR_NHM_PLATFORM_INFO: 0x%08llx\n", msr); + fprintf(stderr, "cpu0: MSR_NHM_PLATFORM_INFO: 0x%08llx\n", msr); ratio = (msr >> 40) & 0xFF; fprintf(stderr, "%d * %.0f = %.0f MHz max efficiency\n", @@ -919,13 +918,16 @@ void print_verbose_header(void) fprintf(stderr, "%d * %.0f = %.0f MHz TSC frequency\n", ratio, bclk, ratio * bclk); + get_msr(0, MSR_IA32_POWER_CTL, &msr); + fprintf(stderr, "cpu0: MSR_IA32_POWER_CTL: 0x%08llx (C1E: %sabled)\n", + msr, msr & 0x2 ? "EN" : "DIS"); + if (!do_ivt_turbo_ratio_limit) goto print_nhm_turbo_ratio_limits; get_msr(0, MSR_IVT_TURBO_RATIO_LIMIT, &msr); - if (verbose) - fprintf(stderr, "cpu0: MSR_IVT_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); + fprintf(stderr, "cpu0: MSR_IVT_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); ratio = (msr >> 56) & 0xFF; if (ratio) @@ -1016,8 +1018,7 @@ print_nhm_turbo_ratio_limits: get_msr(0, MSR_NHM_TURBO_RATIO_LIMIT, &msr); - if (verbose) - fprintf(stderr, "cpu0: MSR_NHM_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); + fprintf(stderr, "cpu0: MSR_NHM_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); ratio = (msr >> 56) & 0xFF; if (ratio) -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/16] intel_idle: support Haswell
From: Len Brown This patch enables intel_idle to run on the next-generation Intel(R) Microarchitecture code named "Haswell". Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index b2cf489..fa71477 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -212,6 +212,38 @@ static struct cpuidle_state ivb_cstates[MWAIT_MAX_NUM_CSTATES] = { .enter = &intel_idle }, }; +static struct cpuidle_state hsw_cstates[MWAIT_MAX_NUM_CSTATES] = { + { /* MWAIT C0 */ }, + { /* MWAIT C1 */ + .name = "C1-HSW", + .desc = "MWAIT 0x00", + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 2, + .target_residency = 2, + .enter = &intel_idle }, + { /* MWAIT C2 */ + .name = "C3-HSW", + .desc = "MWAIT 0x10", + .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 33, + .target_residency = 100, + .enter = &intel_idle }, + { /* MWAIT C3 */ + .name = "C6-HSW", + .desc = "MWAIT 0x20", + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 133, + .target_residency = 400, + .enter = &intel_idle }, + { /* MWAIT C4 */ + .name = "C7s-HSW", + .desc = "MWAIT 0x32", + .flags = MWAIT2flg(0x32) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 166, + .target_residency = 500, + .enter = &intel_idle }, +}; + static struct cpuidle_state atom_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ @@ -365,6 +397,10 @@ static const struct idle_cpu idle_cpu_ivb = { .state_table = ivb_cstates, }; +static const struct idle_cpu idle_cpu_hsw = { + .state_table = hsw_cstates, +}; + #define ICPU(model, cpu) \ { X86_VENDOR_INTEL, 6, model, X86_FEATURE_MWAIT, (unsigned long)&cpu } @@ -382,6 +418,9 @@ static const struct x86_cpu_id intel_idle_ids[] = { ICPU(0x2d, idle_cpu_snb), ICPU(0x3a, idle_cpu_ivb), ICPU(0x3e, idle_cpu_ivb), + ICPU(0x3c, idle_cpu_hsw), + ICPU(0x3f, idle_cpu_hsw), + ICPU(0x45, idle_cpu_hsw), {} }; MODULE_DEVICE_TABLE(x86cpu, intel_idle_ids); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/16] ACPI / idle: remove usage of the statedata
From: Daniel Lezcano Len Brown sent a patch to remove this field in the intel_idle driver. The other user of this field is the davinci cpuidle driver and a patch has been sent to remove the usage of it. This patch removes the last user of this field. Signed-off-by: Daniel Lezcano Signed-off-by: Len Brown --- drivers/acpi/processor_idle.c | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 6837065..8b433cb 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -66,6 +66,8 @@ module_param(latency_factor, uint, 0644); static DEFINE_PER_CPU(struct cpuidle_device *, acpi_cpuidle_device); +static struct acpi_processor_cx *acpi_cstate[CPUIDLE_STATE_MAX]; + static int disabled_by_idle_boot_param(void) { return boot_option_idle_override == IDLE_POLL || @@ -721,8 +723,7 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { struct acpi_processor *pr; - struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; - struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage); + struct acpi_processor_cx *cx = acpi_cstate[index]; pr = __this_cpu_read(processors); @@ -745,8 +746,7 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev, */ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index) { - struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; - struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage); + struct acpi_processor_cx *cx = acpi_cstate[index]; ACPI_FLUSH_CPU_CACHE(); @@ -776,8 +776,7 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { struct acpi_processor *pr; - struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; - struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage); + struct acpi_processor_cx *cx = acpi_cstate[index]; pr = __this_cpu_read(processors); @@ -835,8 +834,7 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { struct acpi_processor *pr; - struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; - struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage); + struct acpi_processor_cx *cx = acpi_cstate[index]; pr = __this_cpu_read(processors); @@ -935,7 +933,6 @@ static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr, { int i, count = CPUIDLE_DRIVER_STATE_START; struct acpi_processor_cx *cx; - struct cpuidle_state_usage *state_usage; if (!pr->flags.power_setup_done) return -EINVAL; @@ -954,7 +951,6 @@ static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr, for (i = 1; i < ACPI_PROCESSOR_MAX_POWER && i <= max_cstate; i++) { cx = &pr->power.states[i]; - state_usage = &dev->states_usage[count]; if (!cx->valid) continue; @@ -965,8 +961,7 @@ static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr, !(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED)) continue; #endif - - cpuidle_set_statedata(state_usage, cx); + acpi_cstate[count] = cx; count++; if (count == CPUIDLE_STATE_MAX) -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
idle patches queued for Linux-3.9
The following is my idle patch queue for Linux-3.9. As noted below, some patches are unchanged since last seen on the list, some are re-freshed, and some are new. Please let me know if you see troubles with any of them. thanks, Len Brown, Intel Open Source Technology Center [PATCH 01/16] intel_idle: stop using driver_data for static flags [PATCH 02/16] Replace the flag by a simple global boolean in the [PATCH 03/16] davinci: cpuidle - move code to prevent forward [PATCH 04/16] davinci: cpuidle - remove the ops [PATCH 05/16] davinci: cpuidle - remove useless initialization [PATCH 06/16] ACPI / idle: remove unused definition [PATCH 07/16] ACPI / idle : remove pointless headers [PATCH 08/16] ACPI / idle: pass the cpuidle_device parameter [PATCH 09/16] ACPI / idle: remove usage of the statedata Unchanged since most recent on list. [PATCH 10/16] cpuidle: remove vestage definition of statedata New [PATCH 11/16] intel_idle: support Haswell Refreshed [PATCH 12/16] tools/power turbostat: support Haswell [PATCH 13/16] tools/power turbostat: decode MSR_IA32_POWER_CTL Unchanged [PATCH 14/16] intel_idle: remove use and definition of MWAIT_MAX_CSTATE [PATCH 15/16] intel_idle: remove assumption of one C-state per MWAIT [PATCH 16/16] intel_idle: export both C1 and C1E New -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 03/16] davinci: cpuidle - move code to prevent forward declaration
From: Daniel Lezcano The patch is mindless, it just moves the idle function below in the file in order to prevent forward declaration in the next patch. Signed-off-by: Daniel Lezcano Acked-by: Sekhar Nori Signed-off-by: Len Brown --- arch/arm/mach-davinci/cpuidle.c | 72 - 1 file changed, 36 insertions(+), 36 deletions(-) diff --git a/arch/arm/mach-davinci/cpuidle.c b/arch/arm/mach-davinci/cpuidle.c index 5fbd470..697febe 100644 --- a/arch/arm/mach-davinci/cpuidle.c +++ b/arch/arm/mach-davinci/cpuidle.c @@ -31,42 +31,6 @@ struct davinci_ops { u32 flags; }; -/* Actual code that puts the SoC in different idle states */ -static int davinci_enter_idle(struct cpuidle_device *dev, - struct cpuidle_driver *drv, - int index) -{ - struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; - struct davinci_ops *ops = cpuidle_get_statedata(state_usage); - - if (ops && ops->enter) - ops->enter(); - - index = cpuidle_wrap_enter(dev, drv, index, - arm_cpuidle_simple_enter); - - if (ops && ops->exit) - ops->exit(); - - return index; -} - -static struct cpuidle_driver davinci_idle_driver = { - .name = "cpuidle-davinci", - .owner = THIS_MODULE, - .en_core_tk_irqen = 1, - .states[0] = ARM_CPUIDLE_WFI_STATE, - .states[1] = { - .enter = davinci_enter_idle, - .exit_latency = 10, - .target_residency = 10, - .flags = CPUIDLE_FLAG_TIME_VALID, - .name = "DDR SR", - .desc = "WFI and DDR Self Refresh", - }, - .state_count = DAVINCI_CPUIDLE_MAX_STATES, -}; - static DEFINE_PER_CPU(struct cpuidle_device, davinci_cpuidle_device); static void __iomem *ddr2_reg_base; static bool ddr2_pdown; @@ -107,6 +71,42 @@ static struct davinci_ops davinci_states[DAVINCI_CPUIDLE_MAX_STATES] = { }, }; +/* Actual code that puts the SoC in different idle states */ +static int davinci_enter_idle(struct cpuidle_device *dev, + struct cpuidle_driver *drv, + int index) +{ + struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; + struct davinci_ops *ops = cpuidle_get_statedata(state_usage); + + if (ops && ops->enter) + ops->enter(); + + index = cpuidle_wrap_enter(dev, drv, index, + arm_cpuidle_simple_enter); + + if (ops && ops->exit) + ops->exit(); + + return index; +} + +static struct cpuidle_driver davinci_idle_driver = { + .name = "cpuidle-davinci", + .owner = THIS_MODULE, + .en_core_tk_irqen = 1, + .states[0] = ARM_CPUIDLE_WFI_STATE, + .states[1] = { + .enter = davinci_enter_idle, + .exit_latency = 10, + .target_residency = 10, + .flags = CPUIDLE_FLAG_TIME_VALID, + .name = "DDR SR", + .desc = "WFI and DDR Self Refresh", + }, + .state_count = DAVINCI_CPUIDLE_MAX_STATES, +}; + static int __init davinci_cpuidle_probe(struct platform_device *pdev) { int ret; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/16] davinci: cpuidle - remove useless initialization
From: Daniel Lezcano The device->state_count is initialized in the cpuidle_register_device function. Signed-off-by: Daniel Lezcano Acked-by: Sekhar Nori Signed-off-by: Len Brown --- arch/arm/mach-davinci/cpuidle.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/arm/mach-davinci/cpuidle.c b/arch/arm/mach-davinci/cpuidle.c index 5e430bf..5ac9e93 100644 --- a/arch/arm/mach-davinci/cpuidle.c +++ b/arch/arm/mach-davinci/cpuidle.c @@ -96,8 +96,6 @@ static int __init davinci_cpuidle_probe(struct platform_device *pdev) ddr2_pdown = pdata->ddr2_pdown; - device->state_count = DAVINCI_CPUIDLE_MAX_STATES; - ret = cpuidle_register_driver(&davinci_idle_driver); if (ret) { dev_err(&pdev->dev, "failed to register driver\n"); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 04/16] davinci: cpuidle - remove the ops
From: Daniel Lezcano With one function handling the idle state and a single variable, the usage of the davinci_ops is overkill. This patch removes these ops and simplify the code. Furthermore, the 'driver_data' field is no longer used, we have 1 of the 3 remaining user of this field removed. Signed-off-by: Daniel Lezcano Acked-by: Sekhar Nori Signed-off-by: Len Brown --- arch/arm/mach-davinci/cpuidle.c | 33 ++--- 1 file changed, 2 insertions(+), 31 deletions(-) diff --git a/arch/arm/mach-davinci/cpuidle.c b/arch/arm/mach-davinci/cpuidle.c index 697febe..5e430bf 100644 --- a/arch/arm/mach-davinci/cpuidle.c +++ b/arch/arm/mach-davinci/cpuidle.c @@ -25,12 +25,6 @@ #define DAVINCI_CPUIDLE_MAX_STATES 2 -struct davinci_ops { - void (*enter) (void); - void (*exit) (void); - u32 flags; -}; - static DEFINE_PER_CPU(struct cpuidle_device, davinci_cpuidle_device); static void __iomem *ddr2_reg_base; static bool ddr2_pdown; @@ -54,39 +48,17 @@ static void davinci_save_ddr_power(int enter, bool pdown) __raw_writel(val, ddr2_reg_base + DDR2_SDRCR_OFFSET); } -static void davinci_c2state_enter(void) -{ - davinci_save_ddr_power(1, ddr2_pdown); -} - -static void davinci_c2state_exit(void) -{ - davinci_save_ddr_power(0, ddr2_pdown); -} - -static struct davinci_ops davinci_states[DAVINCI_CPUIDLE_MAX_STATES] = { - [1] = { - .enter = davinci_c2state_enter, - .exit = davinci_c2state_exit, - }, -}; - /* Actual code that puts the SoC in different idle states */ static int davinci_enter_idle(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { - struct cpuidle_state_usage *state_usage = &dev->states_usage[index]; - struct davinci_ops *ops = cpuidle_get_statedata(state_usage); - - if (ops && ops->enter) - ops->enter(); + davinci_save_ddr_power(1, ddr2_pdown); index = cpuidle_wrap_enter(dev, drv, index, arm_cpuidle_simple_enter); - if (ops && ops->exit) - ops->exit(); + davinci_save_ddr_power(0, ddr2_pdown); return index; } @@ -123,7 +95,6 @@ static int __init davinci_cpuidle_probe(struct platform_device *pdev) ddr2_reg_base = pdata->ddr2_ctlr_base; ddr2_pdown = pdata->ddr2_pdown; - cpuidle_set_statedata(&device->states_usage[1], &davinci_states[1]); device->state_count = DAVINCI_CPUIDLE_MAX_STATES; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 08/16] ACPI / idle: pass the cpuidle_device parameter
From: Daniel Lezcano The cpuidle_device is retrieved in the function by using directly the global variable. But the caller of this function already have this device and it can be passed as a parameter. That is one small step to encapsulate the code more. Signed-off-by: Daniel Lezcano Signed-off-by: Len Brown --- drivers/acpi/processor_idle.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 82626e9..6837065 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -928,13 +928,14 @@ struct cpuidle_driver acpi_idle_driver = { * device i.e. per-cpu data * * @pr: the ACPI processor + * @dev : the cpuidle device */ -static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr) +static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr, + struct cpuidle_device *dev) { int i, count = CPUIDLE_DRIVER_STATE_START; struct acpi_processor_cx *cx; struct cpuidle_state_usage *state_usage; - struct cpuidle_device *dev = per_cpu(acpi_cpuidle_device, pr->id); if (!pr->flags.power_setup_done) return -EINVAL; @@ -1089,7 +1090,7 @@ int acpi_processor_hotplug(struct acpi_processor *pr) cpuidle_disable_device(dev); acpi_processor_get_power_info(pr); if (pr->flags.power) { - acpi_processor_setup_cpuidle_cx(pr); + acpi_processor_setup_cpuidle_cx(pr, dev); ret = cpuidle_enable_device(dev); } cpuidle_resume_and_unlock(); @@ -1147,8 +1148,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr) continue; acpi_processor_get_power_info(_pr); if (_pr->flags.power) { - acpi_processor_setup_cpuidle_cx(_pr); dev = per_cpu(acpi_cpuidle_device, cpu); + acpi_processor_setup_cpuidle_cx(_pr, dev); cpuidle_enable_device(dev); } } @@ -1217,7 +1218,7 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr) return -ENOMEM; per_cpu(acpi_cpuidle_device, pr->id) = dev; - acpi_processor_setup_cpuidle_cx(pr); + acpi_processor_setup_cpuidle_cx(pr, dev); /* Register per-cpu cpuidle_device. Cpuidle driver * must already be registered before registering device -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 07/16] ACPI / idle : remove pointless headers
From: Daniel Lezcano These different headers are not needed. Signed-off-by: Daniel Lezcano Signed-off-by: Len Brown --- drivers/acpi/processor_idle.c | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 5cb5f24..82626e9 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -28,19 +28,12 @@ * ~~ */ -#include #include -#include -#include -#include #include #include -#include -#include/* need_resched() */ -#include +#include/* need_resched() */ #include #include -#include /* * Include the apic definitions for x86 to have the APIC timer related defines @@ -52,12 +45,8 @@ #include #endif -#include -#include - #include #include -#include #define PREFIX "ACPI: " -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 06/16] ACPI / idle: remove unused definition
From: Daniel Lezcano The different definitions are not used anywhere in the code. Signed-off-by: Daniel Lezcano Signed-off-by: Len Brown --- drivers/acpi/processor_idle.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index ed9a1cc..5cb5f24 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -64,10 +64,6 @@ #define ACPI_PROCESSOR_CLASS"processor" #define _COMPONENT ACPI_PROCESSOR_COMPONENT ACPI_MODULE_NAME("processor_idle"); -#define PM_TIMER_TICK_NS (10ULL/PM_TIMER_FREQUENCY) -#define C2_OVERHEAD1 /* 1us */ -#define C3_OVERHEAD1 /* 1us */ -#define PM_TIMER_TICKS_TO_US(p)(((p) * 1000)/(PM_TIMER_FREQUENCY/1000)) static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER; module_param(max_cstate, uint, ); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 01/16] APM idle: register apm_cpu_idle via cpuidle
From: Len Brown Update APM to register its local idle routine with cpuidle. This allows us to stop exporting pm_idle to modules on x86. The Kconfig sub-option, APM_CPU_IDLE, now depends on on CPU_IDLE. Compile-tested only. Signed-off-by: Len Brown Cc: Jiri Kosina --- arch/x86/Kconfig | 1 + arch/x86/kernel/apm_32.c | 58 +-- arch/x86/kernel/process.c | 3 --- 3 files changed, 37 insertions(+), 25 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 225543b..1b63586 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1912,6 +1912,7 @@ config APM_DO_ENABLE this feature. config APM_CPU_IDLE + depends on CPU_IDLE bool "Make CPU Idle calls when idle" ---help--- Enable calls to APM CPU Idle/CPU Busy inside the kernel's idle loop. diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c index d65464e..18a3cc3 100644 --- a/arch/x86/kernel/apm_32.c +++ b/arch/x86/kernel/apm_32.c @@ -232,6 +232,7 @@ #include #include #include +#include #include #include @@ -360,13 +361,37 @@ struct apm_user { * idle percentage above which bios idle calls are done */ #ifdef CONFIG_APM_CPU_IDLE -#warning deprecated CONFIG_APM_CPU_IDLE will be deleted in 2012 #define DEFAULT_IDLE_THRESHOLD 95 #else #define DEFAULT_IDLE_THRESHOLD 100 #endif #define DEFAULT_IDLE_PERIOD(100 / 3) +static int apm_cpu_idle(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index); + +static struct cpuidle_driver apm_idle_driver = { + .name = "apm_idle", + .owner = THIS_MODULE, + .en_core_tk_irqen = 1, + .states = { + { /* entry 0 is for polling */ }, + { /* entry 1 is for APM idle */ + .name = "APM", + .desc = "APM idle", + .flags = CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 250,/* WAG */ + .target_residency = 500,/* WAG */ + .enter = &apm_cpu_idle + }, + }, +}; + +static struct cpuidle_device apm_cpuidle_device = { + .state_count = 2, +}; + + /* * Local variables */ @@ -377,7 +402,6 @@ static struct { static int clock_slowed; static int idle_threshold __read_mostly = DEFAULT_IDLE_THRESHOLD; static int idle_period __read_mostly = DEFAULT_IDLE_PERIOD; -static int set_pm_idle; static int suspends_pending; static int standbys_pending; static int ignore_sys_suspend; @@ -884,8 +908,6 @@ static void apm_do_busy(void) #define IDLE_CALC_LIMIT(HZ * 100) #define IDLE_LEAKY_MAX 16 -static void (*original_pm_idle)(void) __read_mostly; - /** * apm_cpu_idle- cpu idling for APM capable Linux * @@ -894,7 +916,8 @@ static void (*original_pm_idle)(void) __read_mostly; * Furthermore it calls the system default idle routine. */ -static void apm_cpu_idle(void) +static int apm_cpu_idle(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index) { static int use_apm_idle; /* = 0 */ static unsigned int last_jiffies; /* = 0 */ @@ -904,7 +927,6 @@ static void apm_cpu_idle(void) unsigned int jiffies_since_last_check = jiffies - last_jiffies; unsigned int bucket; - WARN_ONCE(1, "deprecated apm_cpu_idle will be deleted in 2012"); recalc: if (jiffies_since_last_check > IDLE_CALC_LIMIT) { use_apm_idle = 0; @@ -950,10 +972,7 @@ recalc: break; } } - if (original_pm_idle) - original_pm_idle(); - else - default_idle(); + default_idle(); local_irq_disable(); jiffies_since_last_check = jiffies - last_jiffies; if (jiffies_since_last_check > idle_period) @@ -964,6 +983,7 @@ recalc: apm_do_busy(); local_irq_enable(); + return index; } /** @@ -2381,9 +2401,9 @@ static int __init apm_init(void) if (HZ != 100) idle_period = (idle_period * HZ) / 100; if (idle_threshold < 100) { - original_pm_idle = pm_idle; - pm_idle = apm_cpu_idle; - set_pm_idle = 1; + if (!cpuidle_register_driver(&apm_idle_driver)) + if (cpuidle_register_device(&apm_cpuidle_device)) + cpuidle_unregister_driver(&apm_idle_driver); } return 0; @@ -2393,15 +2413,9 @@ static void __exit apm_exit(void) { int error; - if (set_pm_idle) { - pm_idle = original_pm_idle; - /* -* We are about to unload the current idle thread pm callback -* (pm_id
[PATCH 04/16] sparc idle: rename pm_idle to sparc_idle
From: Len Brown (pm_idle)() is being removed from linux/pm.h because Linux does not have such a cross-architecture concept. sparc uses an idle function pointer in its architecture specific code. So we re-name sparc use of pm_idle to sparc_idle. Maybe some day, SPARC will cut over to cpuidle... Signed-off-by: Len Brown Cc: sparcli...@vger.kernel.org --- arch/sparc/include/asm/processor.h | 1 + arch/sparc/kernel/apc.c| 3 ++- arch/sparc/kernel/leon_pmc.c | 5 +++-- arch/sparc/kernel/pmc.c| 3 ++- arch/sparc/kernel/process_32.c | 7 +++ 5 files changed, 11 insertions(+), 8 deletions(-) diff --git a/arch/sparc/include/asm/processor.h b/arch/sparc/include/asm/processor.h index 2fe99e6..34baa35 100644 --- a/arch/sparc/include/asm/processor.h +++ b/arch/sparc/include/asm/processor.h @@ -7,5 +7,6 @@ #endif #define nop() __asm__ __volatile__ ("nop") +extern void (*sparc_idle)(void); #endif diff --git a/arch/sparc/kernel/apc.c b/arch/sparc/kernel/apc.c index 348fa1a..eefda32 100644 --- a/arch/sparc/kernel/apc.c +++ b/arch/sparc/kernel/apc.c @@ -20,6 +20,7 @@ #include #include #include +#include /* Debugging * @@ -158,7 +159,7 @@ static int apc_probe(struct platform_device *op) /* Assign power management IDLE handler */ if (!apc_no_idle) - pm_idle = apc_swift_idle; + sparc_idle = apc_swift_idle; printk(KERN_INFO "%s: power management initialized%s\n", APC_DEVNAME, apc_no_idle ? " (CPU idle disabled)" : ""); diff --git a/arch/sparc/kernel/leon_pmc.c b/arch/sparc/kernel/leon_pmc.c index 4e17432..708bca4 100644 --- a/arch/sparc/kernel/leon_pmc.c +++ b/arch/sparc/kernel/leon_pmc.c @@ -9,6 +9,7 @@ #include #include #include +#include /* List of Systems that need fixup instructions around power-down instruction */ unsigned int pmc_leon_fixup_ids[] = { @@ -69,9 +70,9 @@ static int __init leon_pmc_install(void) if (sparc_cpu_model == sparc_leon) { /* Assign power management IDLE handler */ if (pmc_leon_need_fixup()) - pm_idle = pmc_leon_idle_fixup; + sparc_idle = pmc_leon_idle_fixup; else - pm_idle = pmc_leon_idle; + sparc_idle = pmc_leon_idle; printk(KERN_INFO "leon: power management initialized\n"); } diff --git a/arch/sparc/kernel/pmc.c b/arch/sparc/kernel/pmc.c index dcbb62f..8b7297f 100644 --- a/arch/sparc/kernel/pmc.c +++ b/arch/sparc/kernel/pmc.c @@ -17,6 +17,7 @@ #include #include #include +#include /* Debug * @@ -63,7 +64,7 @@ static int pmc_probe(struct platform_device *op) #ifndef PMC_NO_IDLE /* Assign power management IDLE handler */ - pm_idle = pmc_swift_idle; + sparc_idle = pmc_swift_idle; #endif printk(KERN_INFO "%s: power management initialized\n", PMC_DEVNAME); diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c index be8e862..62eede1 100644 --- a/arch/sparc/kernel/process_32.c +++ b/arch/sparc/kernel/process_32.c @@ -43,8 +43,7 @@ * Power management idle function * Set in pm platform drivers (apc.c and pmc.c) */ -void (*pm_idle)(void); -EXPORT_SYMBOL(pm_idle); +void (*sparc_idle)(void); /* * Power-off handler instantiation for pm.h compliance @@ -75,8 +74,8 @@ void cpu_idle(void) /* endless idle loop with no priority at all */ for (;;) { while (!need_resched()) { - if (pm_idle) - (*pm_idle)(); + if (sparc_idle) + (*sparc_idle)(); else cpu_relax(); } -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 10/16] m32r idle: delete pm_idle, and other dead idle code
From: Len Brown All paths on m32r lead to cpu_relax(). So delete the dead code and simply call cpu_relax() directly. Signed-off-by: Len Brown Cc: linux-m...@ml.linux-m32r.org --- arch/m32r/kernel/process.c | 51 ++ 1 file changed, 2 insertions(+), 49 deletions(-) diff --git a/arch/m32r/kernel/process.c b/arch/m32r/kernel/process.c index 765d0f5..bde899e 100644 --- a/arch/m32r/kernel/process.c +++ b/arch/m32r/kernel/process.c @@ -44,36 +44,10 @@ unsigned long thread_saved_pc(struct task_struct *tsk) return tsk->thread.lr; } -/* - * Powermanagement idle function, if any.. - */ -static void (*pm_idle)(void) = NULL; - void (*pm_power_off)(void) = NULL; EXPORT_SYMBOL(pm_power_off); /* - * We use this is we don't have any better - * idle routine.. - */ -static void default_idle(void) -{ - /* M32R_FIXME: Please use "cpu_sleep" mode. */ - cpu_relax(); -} - -/* - * On SMP it's slightly faster (but much more power-consuming!) - * to poll the ->work.need_resched flag instead of waiting for the - * cross-CPU IPI to arrive. Use this option with caution. - */ -static void poll_idle (void) -{ - /* M32R_FIXME */ - cpu_relax(); -} - -/* * The idle thread. There's no useful work to be * done, so just try to conserve power and have a * low exit latency (ie sit in a loop waiting for @@ -84,14 +58,8 @@ void cpu_idle (void) /* endless idle loop with no priority at all */ while (1) { rcu_idle_enter(); - while (!need_resched()) { - void (*idle)(void) = pm_idle; - - if (!idle) - idle = default_idle; - - idle(); - } + while (!need_resched()) + cpu_relax(); rcu_idle_exit(); schedule_preempt_disabled(); } @@ -120,21 +88,6 @@ void machine_power_off(void) /* M32R_FIXME */ } -static int __init idle_setup (char *str) -{ - if (!strncmp(str, "poll", 4)) { - printk("using poll in idle threads.\n"); - pm_idle = poll_idle; - } else if (!strncmp(str, "sleep", 4)) { - printk("using sleep in idle threads.\n"); - pm_idle = default_idle; - } - - return 1; -} - -__setup("idle=", idle_setup); - void show_regs(struct pt_regs * regs) { printk("\n"); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 16/16] xen idle: make xen-specific macro xen-specific
From: Len Brown This macro is only invoked by Xen, so make its definition specific to Xen. > set_pm_idle_to_default() < xen_set_default_idle() Signed-off-by: Len Brown Cc: xen-de...@lists.xensource.com --- arch/x86/include/asm/processor.h | 6 +- arch/x86/kernel/process.c| 4 +++- arch/x86/xen/setup.c | 2 +- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 888184b..c2f7f47 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -998,7 +998,11 @@ extern unsigned long arch_align_stack(unsigned long sp); extern void free_init_pages(char *what, unsigned long begin, unsigned long end); void default_idle(void); -bool set_pm_idle_to_default(void); +#ifdef CONFIG_XEN +bool xen_set_default_idle(void); +#else +#define xen_set_default_idle 0 +#endif void stop_this_cpu(void *dummy); diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index ceb05db..b86de21 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -390,7 +390,8 @@ void default_idle(void) EXPORT_SYMBOL(default_idle); #endif -bool set_pm_idle_to_default(void) +#ifdef CONFIG_XEN +bool xen_set_default_idle(void) { bool ret = !!x86_idle; @@ -398,6 +399,7 @@ bool set_pm_idle_to_default(void) return ret; } +#endif void stop_this_cpu(void *dummy) { local_irq_disable(); diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index 8971a26..2b73b5c 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -561,7 +561,7 @@ void __init xen_arch_setup(void) #endif disable_cpuidle(); disable_cpufreq(); - WARN_ON(set_pm_idle_to_default()); + WARN_ON(xen_set_default_idle()); fiddle_vdso(); #ifdef CONFIG_NUMA numa_off = 1; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 07/16] ARM64 idle: delete pm_idle
From: Len Brown pm_idle() on arm64 was a synonym for default_idle(), so remove it and invoke default_idle() directly. Signed-off-by: Len Brown Cc: linux-arm-ker...@lists.infradead.org --- arch/arm64/kernel/process.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index cb0956b..c7002d4 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -97,14 +97,9 @@ static void default_idle(void) local_irq_enable(); } -void (*pm_idle)(void) = default_idle; -EXPORT_SYMBOL_GPL(pm_idle); - /* - * The idle thread, has rather strange semantics for calling pm_idle, - * but this is what x86 does and we need to do the same, so that - * things like cpuidle get called in the same way. The only difference - * is that we always respect 'hlt_counter' to prevent low power idle. + * The idle thread. + * We always respect 'hlt_counter' to prevent low power idle. */ void cpu_idle(void) { @@ -122,10 +117,10 @@ void cpu_idle(void) local_irq_disable(); if (!need_resched()) { stop_critical_timings(); - pm_idle(); + default_idle(); start_critical_timings(); /* -* pm_idle functions should always return +* default_idle functions should always return * with IRQs enabled. */ WARN_ON(irqs_disabled()); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 08/16] cris idle: delete idle and pm_idle
From: Len Brown pm_idle() and idle() served no purpose on cris -- invoke default_idle() directly. Signed-off-by: Len Brown Cc: linux-cris-ker...@axis.com --- arch/cris/kernel/process.c | 11 +-- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/arch/cris/kernel/process.c b/arch/cris/kernel/process.c index 7f65be6..104ff4d 100644 --- a/arch/cris/kernel/process.c +++ b/arch/cris/kernel/process.c @@ -54,11 +54,6 @@ void enable_hlt(void) EXPORT_SYMBOL(enable_hlt); -/* - * The following aren't currently used. - */ -void (*pm_idle)(void); - extern void default_idle(void); void (*pm_power_off)(void); @@ -77,16 +72,12 @@ void cpu_idle (void) while (1) { rcu_idle_enter(); while (!need_resched()) { - void (*idle)(void); /* * Mark this as an RCU critical section so that * synchronize_kernel() in the unload path waits * for our completion. */ - idle = pm_idle; - if (!idle) - idle = default_idle; - idle(); + default_idle(); } rcu_idle_exit(); schedule_preempt_disabled(); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 15/16] PM idle: remove global declaration of pm_idle
From: Len Brown pm_idle appears in no generic Linux code, it appears only in architecture-specific code. Thus, pm_idle should not be declared in pm.h. Architectures that use an idle function pointer should delcare one local to their architecture, and/or use cpuidle. Signed-off-by: Len Brown Cc: linux...@vger.kernel.org --- include/linux/pm.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/pm.h b/include/linux/pm.h index 03d7bb1..97bcf23 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -31,7 +31,6 @@ /* * Callbacks for platform drivers to implement. */ -extern void (*pm_idle)(void); extern void (*pm_power_off)(void); extern void (*pm_power_off_prepare)(void); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 14/16] unicore32 idle: delete stray pm_idle comment
From: Len Brown as pm_idle() has already been deleted from this code, the comment was a stray. Signed-off-by: Len Brown Cc: Guan Xuetao --- arch/unicore32/kernel/process.c | 5 - 1 file changed, 5 deletions(-) diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index 62bad9f..872d7e2 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -45,11 +45,6 @@ static const char * const processor_modes[] = { "UK18", "UK19", "UK1A", "EXTN", "UK1C", "UK1D", "UK1E", "SUSR" }; -/* - * The idle thread, has rather strange semantics for calling pm_idle, - * but this is what x86 does and we need to do the same, so that - * things like cpuidle get called in the same way. - */ void cpu_idle(void) { /* endless idle loop with no priority at all */ -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 13/16] openrisc idle: delete pm_idle
From: Len Brown pm_idle() on openrisc was dead code. Signed-off-by: Len Brown Cc: li...@lists.openrisc.net --- arch/openrisc/kernel/idle.c | 5 - 1 file changed, 5 deletions(-) diff --git a/arch/openrisc/kernel/idle.c b/arch/openrisc/kernel/idle.c index 7d618fe..5e8a3b6 100644 --- a/arch/openrisc/kernel/idle.c +++ b/arch/openrisc/kernel/idle.c @@ -39,11 +39,6 @@ void (*powersave) (void) = NULL; -static inline void pm_idle(void) -{ - barrier(); -} - void cpu_idle(void) { set_thread_flag(TIF_POLLING_NRFLAG); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 03/16] sh idle: rename global pm_idle to static sh_idle
From: Len Brown SH idle code could use some simplification. This patch enables that by guaranteeing that "sh_idle" is local, and thus architecture specific. Signed-off-by: Len Brown Cc: linux...@vger.kernel.org --- arch/sh/kernel/idle.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c index 0c91016..3d5a1b3 100644 --- a/arch/sh/kernel/idle.c +++ b/arch/sh/kernel/idle.c @@ -22,7 +22,7 @@ #include #include -void (*pm_idle)(void); +static void (*sh_idle)(void); static int hlt_counter; @@ -103,9 +103,9 @@ void cpu_idle(void) /* Don't trace irqs off for idle */ stop_critical_timings(); if (cpuidle_idle_call()) - pm_idle(); + sh_idle(); /* -* Sanity check to ensure that pm_idle() returns +* Sanity check to ensure that sh_idle() returns * with IRQs enabled */ WARN_ON(irqs_disabled()); @@ -123,13 +123,13 @@ void __init select_idle_routine(void) /* * If a platform has set its own idle routine, leave it alone. */ - if (pm_idle) + if (sh_idle) return; if (hlt_works()) - pm_idle = default_idle; + sh_idle = default_idle; else - pm_idle = poll_idle; + sh_idle = poll_idle; } void stop_this_cpu(void *unused) -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/16] ia64 idle: delete pm_idle
From: Len Brown pm_idle() on ia64 was a synonym for default_idle(). So simply invoke default_idle() directly. Signed-off-by: Len Brown Cc: linux-i...@vger.kernel.org --- arch/ia64/kernel/process.c | 3 --- arch/ia64/kernel/setup.c | 1 - 2 files changed, 4 deletions(-) diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c index 31360cb..e34f565 100644 --- a/arch/ia64/kernel/process.c +++ b/arch/ia64/kernel/process.c @@ -57,8 +57,6 @@ void (*ia64_mark_idle)(int); unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE; EXPORT_SYMBOL(boot_option_idle_override); -void (*pm_idle) (void); -EXPORT_SYMBOL(pm_idle); void (*pm_power_off) (void); EXPORT_SYMBOL(pm_power_off); @@ -301,7 +299,6 @@ cpu_idle (void) if (mark_idle) (*mark_idle)(1); - idle = pm_idle; if (!idle) idle = default_idle; (*idle)(); diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c index aaefd9b..2029cc0 100644 --- a/arch/ia64/kernel/setup.c +++ b/arch/ia64/kernel/setup.c @@ -1051,7 +1051,6 @@ cpu_init (void) max_num_phys_stacked = num_phys_stacked; } platform_cpu_init(); - pm_idle = default_idle; } void __init -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 12/16] mn10300 idle: delete pm_idle
From: Len Brown pm_idle on mn10300 served no purpose. Signed-off-by: Len Brown Cc: linux-am33-l...@redhat.com --- arch/mn10300/kernel/process.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/arch/mn10300/kernel/process.c b/arch/mn10300/kernel/process.c index eb09f5a..84f4e97 100644 --- a/arch/mn10300/kernel/process.c +++ b/arch/mn10300/kernel/process.c @@ -37,12 +37,6 @@ #include "internal.h" /* - * power management idle function, if any.. - */ -void (*pm_idle)(void); -EXPORT_SYMBOL(pm_idle); - -/* * return saved PC of a blocked thread. */ unsigned long thread_saved_pc(struct task_struct *tsk) @@ -113,7 +107,6 @@ void cpu_idle(void) void (*idle)(void); smp_rmb(); - idle = pm_idle; if (!idle) { #if defined(CONFIG_SMP) && !defined(CONFIG_HOTPLUG_CPU) idle = poll_idle; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/16] microblaze idle: delete pm_idle
From: Len Brown pm_idle on microblaze served no purpose. Signed-off-by: Len Brown Cc: microblaze-ucli...@itee.uq.edu.au --- arch/microblaze/kernel/process.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c index a5b74f7..6ff2dcf 100644 --- a/arch/microblaze/kernel/process.c +++ b/arch/microblaze/kernel/process.c @@ -41,7 +41,6 @@ void show_regs(struct pt_regs *regs) regs->msr, regs->ear, regs->esr, regs->fsr); } -void (*pm_idle)(void); void (*pm_power_off)(void) = NULL; EXPORT_SYMBOL(pm_power_off); @@ -98,8 +97,6 @@ void cpu_idle(void) /* endless idle loop with no priority at all */ while (1) { - void (*idle)(void) = pm_idle; - if (!idle) idle = default_idle; -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 06/16] ARM idle: delete pm_idle
From: Len Brown pm_idle() on ARM was a synonym for default_idle(), so simply invoke default_idle() directly. Signed-off-by: Len Brown Cc: linux-arm-ker...@lists.infradead.org --- arch/arm/kernel/process.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index c6dec5f..047d3e4 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -172,14 +172,9 @@ static void default_idle(void) local_irq_enable(); } -void (*pm_idle)(void) = default_idle; -EXPORT_SYMBOL(pm_idle); - /* - * The idle thread, has rather strange semantics for calling pm_idle, - * but this is what x86 does and we need to do the same, so that - * things like cpuidle get called in the same way. The only difference - * is that we always respect 'hlt_counter' to prevent low power idle. + * The idle thread. + * We always respect 'hlt_counter' to prevent low power idle. */ void cpu_idle(void) { @@ -210,10 +205,10 @@ void cpu_idle(void) } else if (!need_resched()) { stop_critical_timings(); if (cpuidle_idle_call()) - pm_idle(); + default_idle(); start_critical_timings(); /* -* pm_idle functions must always +* default_idle functions must always * return with IRQs enabled. */ WARN_ON(irqs_disabled()); -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
pm_idle cleanup patch series
Use of (pm_idle)() spread from x86 to a number of architectures. Some used it like x86, but most copied it for no functional reason. There is no Linux architecture-independent code that mandates that an architecture supply a pm_idle(). So we delete it from pm.h and we either remove it or re-name it to be private to each architecture. It is a safe bet that I fat-fingered something in this series. Please send me an Ack if you can build/test or I broke your favorite machine. In particular, for those with computer museums, I'd like help testing the APM patch, which is compile-tested only. thanks, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/16] blackfin idle: delete pm_idle
From: Len Brown pm_idle is dead code on blackfin. Signed-off-by: Len Brown Cc: uclinux-dist-de...@blackfin.uclinux.org --- arch/blackfin/kernel/process.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/arch/blackfin/kernel/process.c b/arch/blackfin/kernel/process.c index 3e16ad9..8061426 100644 --- a/arch/blackfin/kernel/process.c +++ b/arch/blackfin/kernel/process.c @@ -39,12 +39,6 @@ int nr_l1stack_tasks; void *l1_stack_base; unsigned long l1_stack_len; -/* - * Powermanagement idle function, if any.. - */ -void (*pm_idle)(void) = NULL; -EXPORT_SYMBOL(pm_idle); - void (*pm_power_off)(void) = NULL; EXPORT_SYMBOL(pm_power_off); @@ -81,7 +75,6 @@ void cpu_idle(void) { /* endless idle loop with no priority at all */ while (1) { - void (*idle)(void) = pm_idle; #ifdef CONFIG_HOTPLUG_CPU if (cpu_is_offline(smp_processor_id())) -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 02/16] x86 idle: rename global pm_idle to static x86_idle
From: Len Brown (pm_idle)() is being removed from linux/pm.h because Linux does not have such a cross-architecture concept. x86 uses an idle function pointer in its architecture specific code as a backup to cpuidle. So we re-name x86 use of pm_idle to x86_idle, and make it static to x86. Signed-off-by: Len Brown Cc: x...@kernel.org --- arch/x86/kernel/process.c | 28 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index f571a6e..ceb05db 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -268,10 +268,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE; EXPORT_SYMBOL(boot_option_idle_override); -/* - * Powermanagement idle function, if any.. - */ -void (*pm_idle)(void); +static void (*x86_idle)(void); #ifndef CONFIG_SMP static inline void play_dead(void) @@ -348,7 +345,7 @@ void cpu_idle(void) rcu_idle_enter(); if (cpuidle_idle_call()) - pm_idle(); + x86_idle(); rcu_idle_exit(); start_critical_timings(); @@ -395,9 +392,9 @@ EXPORT_SYMBOL(default_idle); bool set_pm_idle_to_default(void) { - bool ret = !!pm_idle; + bool ret = !!x86_idle; - pm_idle = default_idle; + x86_idle = default_idle; return ret; } @@ -564,11 +561,10 @@ static void amd_e400_idle(void) void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c) { #ifdef CONFIG_SMP - if (pm_idle == poll_idle && smp_num_siblings > 1) { + if (x86_idle == poll_idle && smp_num_siblings > 1) pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n"); - } #endif - if (pm_idle) + if (x86_idle) return; if (cpu_has(c, X86_FEATURE_MWAIT) && mwait_usable(c)) { @@ -576,19 +572,19 @@ void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c) * One CPU supports mwait => All CPUs supports mwait */ pr_info("using mwait in idle threads\n"); - pm_idle = mwait_idle; + x86_idle = mwait_idle; } else if (cpu_has_amd_erratum(amd_erratum_400)) { /* E400: APIC timer interrupt does not wake up CPU from C1e */ pr_info("using AMD E400 aware idle routine\n"); - pm_idle = amd_e400_idle; + x86_idle = amd_e400_idle; } else - pm_idle = default_idle; + x86_idle = default_idle; } void __init init_amd_e400_c1e_mask(void) { /* If we're using amd_e400_idle, we need to allocate amd_e400_c1e_mask. */ - if (pm_idle == amd_e400_idle) + if (x86_idle == amd_e400_idle) zalloc_cpumask_var(&amd_e400_c1e_mask, GFP_KERNEL); } @@ -599,7 +595,7 @@ static int __init idle_setup(char *str) if (!strcmp(str, "poll")) { pr_info("using polling idle threads\n"); - pm_idle = poll_idle; + x86_idle = poll_idle; boot_option_idle_override = IDLE_POLL; } else if (!strcmp(str, "mwait")) { boot_option_idle_override = IDLE_FORCE_MWAIT; @@ -612,7 +608,7 @@ static int __init idle_setup(char *str) * To continue to load the CPU idle driver, don't touch * the boot_option_idle_override. */ - pm_idle = default_idle; + x86_idle = default_idle; boot_option_idle_override = IDLE_HALT; } else if (!strcmp(str, "nomwait")) { /* -- 1.8.1.3.535.ga923c31 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Two x86 idle cleanups
The following two patches remove cruft from the x86 native idle code. These two removals have been marked with a WARN() since Linux 3.0. Let me know if you see any troubles w/ these patches. TIP, These will conflict with another patch in my tree. So if you approve, please Ack and I can carry in my tree. thanks, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] x86 idle: remove mwait_idle() and "idle=mwait" cmdline param
From: Len Brown mwait_idle() is a C1-only idle loop intended to be more efficient than HLT, starting on Pentium-4 HT-enabled processors. But mwait_idle() has been replaced by the more general mwait_idle_with_hints(), which handles both C1 and deeper C-states. ACPI processor_idle and intel_idle use only mwait_idle_with_hints(), and no longer use mwait_idle(). Here we simplify the x86 native idle code by removing mwait_idle(), and the "idle=mwait" bootparam used to invoke it. Since Linux 3.0 there has been a boot-time warning when "idle=mwait" was invoked saying it would be removed in 2012. This removal was also noted in the (now removed:-) feature-removal-schedule.txt. After this change, kernels configured with (CONFIG_ACPI=n && CONFIG_INTEL_IDLE=n) when run on hardware that supports MWAIT will simply use HLT. If MWAIT is desired on those systems, cpuidle and the cpuidle drivers above can be enabled. Signed-off-by: Len Brown Cc: x...@kernel.org --- Documentation/kernel-parameters.txt | 7 +--- arch/x86/include/asm/processor.h| 2 +- arch/x86/kernel/process.c | 79 + arch/x86/kernel/smpboot.c | 2 +- drivers/acpi/processor_idle.c | 1 - 5 files changed, 4 insertions(+), 87 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 363e348..3b0cd1e 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1039,16 +1039,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Claim all unknown PCI IDE storage controllers. idle= [X86] - Format: idle=poll, idle=mwait, idle=halt, idle=nomwait + Format: idle=poll, idle=halt, idle=nomwait Poll forces a polling idle loop that can slightly improve the performance of waking up a idle CPU, but will use a lot of power and make the system run hot. Not recommended. - idle=mwait: On systems which support MONITOR/MWAIT but - the kernel chose to not use it because it doesn't save - as much power as a normal idle loop, use the - MONITOR/MWAIT idle loop anyways. Performance should be - the same as idle=poll. idle=halt: Halt is forced to be used for CPU idle. In such case C2/C3 won't be used again. idle=nomwait: Disable mwait for CPU C-states diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index c2f7f47..8a28fea 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -725,7 +725,7 @@ extern unsigned long boot_option_idle_override; extern boolamd_e400_c1e_detected; enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_NOMWAIT, -IDLE_POLL, IDLE_FORCE_MWAIT}; +IDLE_POLL}; extern void enable_sep_cpu(void); extern int sysenter_setup(void); diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 7ed9f6b0..cd5a4c9 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -421,27 +421,6 @@ void stop_this_cpu(void *dummy) } } -/* Default MONITOR/MWAIT with no hints, used for default C1 state */ -static void mwait_idle(void) -{ - if (!need_resched()) { - trace_power_start_rcuidle(POWER_CSTATE, 1, smp_processor_id()); - trace_cpu_idle_rcuidle(1, smp_processor_id()); - if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) - clflush((void *)¤t_thread_info()->flags); - - __monitor((void *)¤t_thread_info()->flags, 0, 0); - smp_mb(); - if (!need_resched()) - __sti_mwait(0, 0); - else - local_irq_enable(); - trace_power_end_rcuidle(smp_processor_id()); - trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id()); - } else - local_irq_enable(); -} - /* * On SMP it's slightly faster (but much more power-consuming!) * to poll the ->work.need_resched flag instead of waiting for the @@ -458,53 +437,6 @@ static void poll_idle(void) trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id()); } -/* - * mwait selection logic: - * - * It depends on the CPU. For AMD CPUs that support MWAIT this is - * wrong. Family 0x10 and 0x11 CPUs will enter C1 on HLT. Powersavings - * then depend on a clock divisor and current Pstate of the core. If - * all cores of a processor are in halt state (C1) the processor can - * enter the C1E (C1 enhanced) state. If mwait is used this will never - * ha
[PATCH 2/2] x86 idle: remove 32-bit-only "no-hlt" parameter, hlt_works_ok flag
From: Len Brown Remove 32-bit x86 a cmdline param "no-hlt", and the cpuinfo_x86.hlt_works_ok that it sets. If a user wants to avoid HLT, then "idle=poll" is much more useful, as it avoids invocation of HLT in idle, while "no-hlt" failed to do so. Indeed, hlt_works_ok was consulted in only 2 places. First, in /proc/cpuinfo where "hlt_bug yes" would be printed if and only if the user booted the system with "no-hlt" -- as there was no other code to set that flag. Second, it was consulted in stop_this_cpu(), which is invoked by native_machine_halt()/reboot_interrupt()/smp_stop_nmi_callback() -- all cases where the machine is being shutdown/reset. The flag was not consulted in the more frequently invoked play_dead()/hlt_play_dead() used in processor offline and suspend. Since Linux-3.0 there has been a run-time notice upon "no-hlt" invocations indicating that it would be removed in 2012. Signed-off-by: Len Brown Cc: x...@kernel.org --- Documentation/kernel-parameters.txt | 4 arch/x86/include/asm/processor.h| 10 -- arch/x86/kernel/cpu/bugs.c | 9 - arch/x86/kernel/cpu/proc.c | 2 -- arch/x86/kernel/process.c | 6 ++ arch/x86/xen/setup.c| 3 --- 6 files changed, 2 insertions(+), 32 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 3b0cd1e..109ee45 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1881,10 +1881,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. wfi(ARM) instruction doesn't work correctly and not to use it. This is also useful when using JTAG debugger. - no-hlt [BUGS=X86-32] Tells the kernel that the hlt - instruction doesn't work correctly and not to - use it. - no_file_capsTells the kernel not to honor file capabilities. The only way then for a file to be executed with privilege is to be setuid root or executed by root. diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 8a28fea..b9e7d27 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -89,7 +89,6 @@ struct cpuinfo_x86 { charwp_works_ok;/* It doesn't on 386's */ /* Problems on some 486Dx4's and old 386's: */ - charhlt_works_ok; charhard_math; charrfu; charfdiv_bug; @@ -165,15 +164,6 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info); extern const struct seq_operations cpuinfo_op; -static inline int hlt_works(int cpu) -{ -#ifdef CONFIG_X86_32 - return cpu_data(cpu).hlt_works_ok; -#else - return 1; -#endif -} - #define cache_line_size() (boot_cpu_data.x86_cache_alignment) extern void cpu_detect(struct cpuinfo_x86 *c); diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 92dfec9..2488be6 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -17,15 +17,6 @@ #include #include -static int __init no_halt(char *s) -{ - WARN_ONCE(1, "\"no-hlt\" is deprecated, please use \"idle=poll\"\n"); - boot_cpu_data.hlt_works_ok = 0; - return 1; -} - -__setup("no-hlt", no_halt); - static int __init no_387(char *s) { boot_cpu_data.hard_math = 0; diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c index 3286a92..e280253 100644 --- a/arch/x86/kernel/cpu/proc.c +++ b/arch/x86/kernel/cpu/proc.c @@ -28,7 +28,6 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c) { seq_printf(m, "fdiv_bug\t: %s\n" - "hlt_bug\t\t: %s\n" "f00f_bug\t: %s\n" "coma_bug\t: %s\n" "fpu\t\t: %s\n" @@ -36,7 +35,6 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c) "cpuid level\t: %d\n" "wp\t\t: %s\n", c->fdiv_bug ? "yes" : "no", - c->hlt_works_ok ? "no" : "yes", c->f00f_bug ? "yes" : "no", c->coma_bug ? "yes" : "no", c->hard_math ? "yes" : "no", diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index cd5a4c9..aef852e 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -415,10 +415,8 @@ void stop_this_cpu(void *dummy) set_cpu_online
[PATCH 2/2 v2] x86 idle: remove 32-bit-only "no-hlt" parameter, hlt_works_ok flag
From: Len Brown Remove 32-bit x86 a cmdline param "no-hlt", and the cpuinfo_x86.hlt_works_ok that it sets. If a user wants to avoid HLT, then "idle=poll" is much more useful, as it avoids invocation of HLT in idle, while "no-hlt" failed to do so. Indeed, hlt_works_ok was consulted in only 3 places. First, in /proc/cpuinfo where "hlt_bug yes" would be printed if and only if the user booted the system with "no-hlt" -- as there was no other code to set that flag. Second, check_hlt() would not invoke halt() if "no-hlt" were on the cmdline. Third, it was consulted in stop_this_cpu(), which is invoked by native_machine_halt()/reboot_interrupt()/smp_stop_nmi_callback() -- all cases where the machine is being shutdown/reset. The flag was not consulted in the more frequently invoked play_dead()/hlt_play_dead() used in processor offline and suspend. Since Linux-3.0 there has been a run-time notice upon "no-hlt" invocations indicating that it would be removed in 2012. Signed-off-by: Len Brown Cc: x...@kernel.org --- v2: remove also check_hlt() --- Documentation/kernel-parameters.txt | 4 arch/x86/include/asm/processor.h| 10 -- arch/x86/kernel/cpu/bugs.c | 27 --- arch/x86/kernel/cpu/proc.c | 2 -- arch/x86/kernel/process.c | 6 ++ arch/x86/xen/setup.c| 3 --- 6 files changed, 2 insertions(+), 50 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 3b0cd1e..109ee45 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1881,10 +1881,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. wfi(ARM) instruction doesn't work correctly and not to use it. This is also useful when using JTAG debugger. - no-hlt [BUGS=X86-32] Tells the kernel that the hlt - instruction doesn't work correctly and not to - use it. - no_file_capsTells the kernel not to honor file capabilities. The only way then for a file to be executed with privilege is to be setuid root or executed by root. diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 8a28fea..b9e7d27 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -89,7 +89,6 @@ struct cpuinfo_x86 { charwp_works_ok;/* It doesn't on 386's */ /* Problems on some 486Dx4's and old 386's: */ - charhlt_works_ok; charhard_math; charrfu; charfdiv_bug; @@ -165,15 +164,6 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct cpuinfo_x86, cpu_info); extern const struct seq_operations cpuinfo_op; -static inline int hlt_works(int cpu) -{ -#ifdef CONFIG_X86_32 - return cpu_data(cpu).hlt_works_ok; -#else - return 1; -#endif -} - #define cache_line_size() (boot_cpu_data.x86_cache_alignment) extern void cpu_detect(struct cpuinfo_x86 *c); diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 92dfec9..af6455e 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -17,15 +17,6 @@ #include #include -static int __init no_halt(char *s) -{ - WARN_ONCE(1, "\"no-hlt\" is deprecated, please use \"idle=poll\"\n"); - boot_cpu_data.hlt_works_ok = 0; - return 1; -} - -__setup("no-hlt", no_halt); - static int __init no_387(char *s) { boot_cpu_data.hard_math = 0; @@ -89,23 +80,6 @@ static void __init check_fpu(void) pr_warn("Hmm, FPU with FDIV bug\n"); } -static void __init check_hlt(void) -{ - if (boot_cpu_data.x86 >= 5 || paravirt_enabled()) - return; - - pr_info("Checking 'hlt' instruction... "); - if (!boot_cpu_data.hlt_works_ok) { - pr_cont("disabled\n"); - return; - } - halt(); - halt(); - halt(); - halt(); - pr_cont("OK\n"); -} - /* * Check whether we are able to run this kernel safely on SMP. * @@ -129,7 +103,6 @@ void __init check_bugs(void) print_cpu_info(&boot_cpu_data); #endif check_config(); - check_hlt(); init_utsname()->machine[1] = '0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86); alternative_instructions(); diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c index 3286a92..e280253 100644 --- a/arch/x86/kernel/cpu/proc.c +++ b/arch/x86/kernel/cpu/proc.c @@ -28,7 +28,6 @@ static void show_cpuinfo_misc(struct
Re: [PATCH 06/16] ARM idle: delete pm_idle
On 02/11/2013 11:11 AM, Russell King - ARM Linux wrote: > On Mon, Feb 11, 2013 at 04:02:30PM +, Catalin Marinas wrote: >> On Sun, Feb 10, 2013 at 05:58:13AM +0000, Len Brown wrote: >>> pm_idle() on ARM was a synonym for default_idle(), >>> so simply invoke default_idle() directly. >> >> The clean-up looks fine as we already have an arm_pm_idle but longer >> term I was thinking about having a common declaration similar to >> pm_power_off that code under drivers/power/(reset/) can override (and >> such driver may be shared by multiple architectures). OTOH, if you get >> rid of the generic linux/pm.h declaration architectures can use a common >> pm_idle name and type (though I think having it in the common header >> would be better). For ARM this would mean s/arm_pm_idle/pm_idle/ on top >> if your patch. > > pm_idle() was that common declaration - but it had the side effect that > it was defined to be called with interrupts disabled, but return with > interrupts enabled. Yeah, I expect that grew out of the x86 idiom "STI;HLT", which dictated default_idle(), which in-turn dictated pm_idle(). x86's MWAIT instruction now has more flexibility WRT interrupts, but the cast was already set... Please let me know if you Ack the patch as is, or would like it changed. thanks, -Len Brown, Intel Open Source Technology Center > arm_pm_idle() "fixed" that weirdness such that it's now expected to > return with IRQs in the same state that it was called. > > pm_power_off() is a cross-arch hook already. > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/16] APM idle: register apm_cpu_idle via cpuidle
On 02/11/2013 04:18 AM, Daniel Lezcano wrote: > On 02/10/2013 06:58 AM, Len Brown wrote: >> From: Len Brown >> >> Update APM to register its local idle routine with cpuidle. >> >> This allows us to stop exporting pm_idle to modules on x86. >> >> The Kconfig sub-option, APM_CPU_IDLE, now depends on on CPU_IDLE. >> >> Compile-tested only. >> >> Signed-off-by: Len Brown >> Cc: Jiri Kosina >> --- > >> +static struct cpuidle_device apm_cpuidle_device = { >> +.state_count = 2, >> +}; > > This will make the cpuidle_register_driver to fail because the > state_count field of the cpuidle_driver is zero, no ? > > static struct cpuidle_driver apm_idle_driver = { > .name = "apm_idle", > ... > .states = { > ... > }, > .state_count = 2, > } yup, good catch. >> -static void apm_cpu_idle(void) >> +static int apm_cpu_idle(struct cpuidle_device *dev, >> +struct cpuidle_driver *drv, int index) >> { >> static int use_apm_idle; /* = 0 */ >> static unsigned int last_jiffies; /* = 0 */ ... >> @@ -964,6 +983,7 @@ recalc: >> apm_do_busy(); >> >> local_irq_enable(); > > > > As the 'en_core_tk_irqen' flag has been enabled in the driver, the > caller of this function will call 'local_irq_enable'. We can remove the > this line no ? yup, this line is no longer necessary. v2 on the way -- hoping somebody with an APM box can test it. thanks, -Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
APM: Request to Test
Thanks, Daniel, for reviewing the patch -- never would have worked... Here is v2. Is there anybody out there with an APM box that can run an upstream kernel with this patch in it? thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] APM idle: register apm_cpu_idle via cpuidle
From: Len Brown Update APM to register its local idle routine with cpuidle. This allows us to stop exporting pm_idle to modules on x86. The Kconfig sub-option, APM_CPU_IDLE, now depends on on CPU_IDLE. Compile-tested only. Signed-off-by: Len Brown Cc: Jiri Kosina --- v2: updates from Daniel Lezcano's review --- arch/x86/Kconfig | 1 + arch/x86/kernel/apm_32.c | 60 +-- arch/x86/kernel/process.c | 3 --- 3 files changed, 38 insertions(+), 26 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 225543b..1b63586 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1912,6 +1912,7 @@ config APM_DO_ENABLE this feature. config APM_CPU_IDLE + depends on CPU_IDLE bool "Make CPU Idle calls when idle" ---help--- Enable calls to APM CPU Idle/CPU Busy inside the kernel's idle loop. diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c index d65464e..f9f8afc 100644 --- a/arch/x86/kernel/apm_32.c +++ b/arch/x86/kernel/apm_32.c @@ -232,6 +232,7 @@ #include #include #include +#include #include #include @@ -360,13 +361,38 @@ struct apm_user { * idle percentage above which bios idle calls are done */ #ifdef CONFIG_APM_CPU_IDLE -#warning deprecated CONFIG_APM_CPU_IDLE will be deleted in 2012 #define DEFAULT_IDLE_THRESHOLD 95 #else #define DEFAULT_IDLE_THRESHOLD 100 #endif #define DEFAULT_IDLE_PERIOD(100 / 3) +static int apm_cpu_idle(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index); + +static struct cpuidle_driver apm_idle_driver = { + .name = "apm_idle", + .owner = THIS_MODULE, + .en_core_tk_irqen = 1, + .states = { + { /* entry 0 is for polling */ }, + { /* entry 1 is for APM idle */ + .name = "APM", + .desc = "APM idle", + .flags = CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 250,/* WAG */ + .target_residency = 500,/* WAG */ + .enter = &apm_cpu_idle + }, + }, + .state_count = 2, +}; + +static struct cpuidle_device apm_cpuidle_device = { + .state_count = 2, +}; + + /* * Local variables */ @@ -377,7 +403,6 @@ static struct { static int clock_slowed; static int idle_threshold __read_mostly = DEFAULT_IDLE_THRESHOLD; static int idle_period __read_mostly = DEFAULT_IDLE_PERIOD; -static int set_pm_idle; static int suspends_pending; static int standbys_pending; static int ignore_sys_suspend; @@ -884,8 +909,6 @@ static void apm_do_busy(void) #define IDLE_CALC_LIMIT(HZ * 100) #define IDLE_LEAKY_MAX 16 -static void (*original_pm_idle)(void) __read_mostly; - /** * apm_cpu_idle- cpu idling for APM capable Linux * @@ -894,7 +917,8 @@ static void (*original_pm_idle)(void) __read_mostly; * Furthermore it calls the system default idle routine. */ -static void apm_cpu_idle(void) +static int apm_cpu_idle(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index) { static int use_apm_idle; /* = 0 */ static unsigned int last_jiffies; /* = 0 */ @@ -904,7 +928,6 @@ static void apm_cpu_idle(void) unsigned int jiffies_since_last_check = jiffies - last_jiffies; unsigned int bucket; - WARN_ONCE(1, "deprecated apm_cpu_idle will be deleted in 2012"); recalc: if (jiffies_since_last_check > IDLE_CALC_LIMIT) { use_apm_idle = 0; @@ -950,10 +973,7 @@ recalc: break; } } - if (original_pm_idle) - original_pm_idle(); - else - default_idle(); + default_idle(); local_irq_disable(); jiffies_since_last_check = jiffies - last_jiffies; if (jiffies_since_last_check > idle_period) @@ -963,7 +983,7 @@ recalc: if (apm_idle_done) apm_do_busy(); - local_irq_enable(); + return index; } /** @@ -2381,9 +2401,9 @@ static int __init apm_init(void) if (HZ != 100) idle_period = (idle_period * HZ) / 100; if (idle_threshold < 100) { - original_pm_idle = pm_idle; - pm_idle = apm_cpu_idle; - set_pm_idle = 1; + if (!cpuidle_register_driver(&apm_idle_driver)) + if (cpuidle_register_device(&apm_cpuidle_device)) + cpuidle_unregister_driver(&apm_idle_driver); } return 0; @@ -2393,15 +2413,9 @@ static void __exit apm_exit(void) { int error; - if (set_pm_idle) { - pm_idle = original_pm_idle; -
Re: linux-next: build failure after merge of the final tree (acpi tree related)
On 02/11/2013 02:34 AM, Stephen Rothwell wrote: > Hi all, > > After merging the final tree, today's linux-next build (sparc64 defconfig) > failed like this: > > arch/sparc/include/asm/processor.h: Assembler messages: > arch/sparc/include/asm/processor.h:10: Error: Unknown opcode: `extern' > > Caused by commit 3a242f58a5f4 ("sparc idle: rename pm_idle to > sparc_idle") from the acpi tree. > > I have applied this patch for today: > > From: Stephen Rothwell > Date: Mon, 11 Feb 2013 18:30:19 +1100 > Subject: [PATCH] sparc idle: protect variable declarations against the > assembler > > Signed-off-by: Stephen Rothwell > --- > arch/sparc/include/asm/processor.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/sparc/include/asm/processor.h > b/arch/sparc/include/asm/processor.h > index 34baa35..622cfa5 100644 > --- a/arch/sparc/include/asm/processor.h > +++ b/arch/sparc/include/asm/processor.h > @@ -7,6 +7,8 @@ > #endif > > #define nop()__asm__ __volatile__ ("nop") > +#ifndef __ASSEMBLY__ > extern void (*sparc_idle)(void); > +#endif > > #endif > Thank you Stephen! The last time I compiled a sparc kernel was in 1993:-) I've added your fix and Dave's Ack to this patch, and updated it in my next branch. Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: manual merge of the acpi tree with the pm tree
On 02/10/2013 08:58 PM, Stephen Rothwell wrote: > Hi Len, > > Today's linux-next merge of the acpi tree got a conflict in > drivers/acpi/processor_idle.c between commit 4f8429166818 ("ACPICA: > Cleanup PM_TIMER_FREQUENCY definition") from the pm tree and commit > 41cdb0efc42e ("ACPI / idle: remove unused definition") from the acpi tree. > > I fixed it up (The latter removed the lines modified by the former, so I > did that) and can carry the fix as necessary (no action is required). > Thank you Stephen, Go ahead and keep these two merge fix-ups. We'll either merge the rjw and the lenb trees before sending to Linus, or let him repeat your merge fix-ups as he likes to do. BTW. Rafael's "pm" tree now carries the ACPI patch stream, so it is probably a mis-representation to call my tree the "acpi" tree. My tree is primarily focused on the "idle" part of pm these days. thanks, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 14/16] intel_idle: remove use and definition of MWAIT_MAX_NUM_CSTATES
On 02/11/2013 03:53 AM, Daniel Lezcano wrote: > On 02/09/2013 02:08 AM, Len Brown wrote: >> The reason to change is that intel_idle will soon be able >> to export more than the 8 "major" states supported by MWAIT. >> When we hit that limit, it is important to know >> where the limit comes from. > > Does it mean CPUIDLE_STATE_MAX may increase in a near future ? Yes, perhaps to 10. Let me know if you anticipate issues with doing that. thanks, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Should SPARC use cpuidle? (was: linux-next: build failure after merge of the final tree (acpi tree related))
On 02/12/2013 12:35 PM, Sam Ravnborg wrote: >>> Signed-off-by: Stephen Rothwell >>> --- >>> arch/sparc/include/asm/processor.h | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/arch/sparc/include/asm/processor.h >>> b/arch/sparc/include/asm/processor.h >>> index 34baa35..622cfa5 100644 >>> --- a/arch/sparc/include/asm/processor.h >>> +++ b/arch/sparc/include/asm/processor.h >>> @@ -7,6 +7,8 @@ >>> #endif >>> >>> #define nop() __asm__ __volatile__ ("nop") >>> +#ifndef __ASSEMBLY__ >>> extern void (*sparc_idle)(void); >>> +#endif >>> >>> #endif >>> >> >> Thank you Stephen! >> >> The last time I compiled a sparc kernel was in 1993:-) >> >> I've added your fix and Dave's Ack to this patch, >> and updated it in my next branch. > > Hi Len. > > Can you please move the definition of sparc_idle to processor_32.h > It is sparc32 specific - and then we do not need the __ASSEMBLY__ guards > as the sparc32 variant are not used from assembler. sure, let me know if attached works. > Do you btw. have any hints how I can convert to the cpu_idle thing you hinted? If you have exactly 1 idle state, then cpuidle isn't that interesting, except, perhaps the standard residency counters. If you have multiple states to choose from, cpuidle becomes more valuable. There are lots of cpuidle users now, including x86's intel_idle, processor_idle, and the entire ARM tree. In my tree right now is a patch to convert APM to cpuidle -- though as nobody has tested it yet I can't guarantee it is correct. patches/issues related to idle should to to linux...@vger.kernel.org (on cc) thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Should SPARC use cpuidle?
>> Can you please move the definition of sparc_idle to processor_32.h >> It is sparc32 specific - and then we do not need the __ASSEMBLY__ guards >> as the sparc32 variant are not used from assembler. > > sure, let me know if attached works. ugh, not accustomed to sending patches via thunderbird. hopefully this attachment works... >From 358ca5d7e02c4559ad3fbf8135421e4a3753e979 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Sat, 9 Feb 2013 23:27:26 -0500 Subject: [PATCH] sparc idle: rename pm_idle to sparc_idle Reply-To: Len Brown Organization: Intel Open Source Technology Center (pm_idle)() is being removed from linux/pm.h because Linux does not have such a cross-architecture concept. sparc uses an idle function pointer in its architecture specific code. So we re-name sparc use of pm_idle to sparc_idle. Maybe some day, SPARC will cut over to cpuidle... Signed-off-by: Len Brown Acked-by: David S. Miller --- arch/sparc/include/asm/processor_32.h | 1 + arch/sparc/kernel/apc.c | 3 ++- arch/sparc/kernel/leon_pmc.c | 5 +++-- arch/sparc/kernel/pmc.c | 3 ++- arch/sparc/kernel/process_32.c| 7 +++ 5 files changed, 11 insertions(+), 8 deletions(-) diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h index c1e0191..2c7baa4 100644 --- a/arch/sparc/include/asm/processor_32.h +++ b/arch/sparc/include/asm/processor_32.h @@ -118,6 +118,7 @@ extern unsigned long get_wchan(struct task_struct *); extern struct task_struct *last_task_used_math; #define cpu_relax() barrier() +extern void (*sparc_idle)(void); #endif diff --git a/arch/sparc/kernel/apc.c b/arch/sparc/kernel/apc.c index 348fa1a..eefda32 100644 --- a/arch/sparc/kernel/apc.c +++ b/arch/sparc/kernel/apc.c @@ -20,6 +20,7 @@ #include #include #include +#include /* Debugging * @@ -158,7 +159,7 @@ static int apc_probe(struct platform_device *op) /* Assign power management IDLE handler */ if (!apc_no_idle) - pm_idle = apc_swift_idle; + sparc_idle = apc_swift_idle; printk(KERN_INFO "%s: power management initialized%s\n", APC_DEVNAME, apc_no_idle ? " (CPU idle disabled)" : ""); diff --git a/arch/sparc/kernel/leon_pmc.c b/arch/sparc/kernel/leon_pmc.c index 4e17432..708bca4 100644 --- a/arch/sparc/kernel/leon_pmc.c +++ b/arch/sparc/kernel/leon_pmc.c @@ -9,6 +9,7 @@ #include #include #include +#include /* List of Systems that need fixup instructions around power-down instruction */ unsigned int pmc_leon_fixup_ids[] = { @@ -69,9 +70,9 @@ static int __init leon_pmc_install(void) if (sparc_cpu_model == sparc_leon) { /* Assign power management IDLE handler */ if (pmc_leon_need_fixup()) - pm_idle = pmc_leon_idle_fixup; + sparc_idle = pmc_leon_idle_fixup; else - pm_idle = pmc_leon_idle; + sparc_idle = pmc_leon_idle; printk(KERN_INFO "leon: power management initialized\n"); } diff --git a/arch/sparc/kernel/pmc.c b/arch/sparc/kernel/pmc.c index dcbb62f..8b7297f 100644 --- a/arch/sparc/kernel/pmc.c +++ b/arch/sparc/kernel/pmc.c @@ -17,6 +17,7 @@ #include #include #include +#include /* Debug * @@ -63,7 +64,7 @@ static int pmc_probe(struct platform_device *op) #ifndef PMC_NO_IDLE /* Assign power management IDLE handler */ - pm_idle = pmc_swift_idle; + sparc_idle = pmc_swift_idle; #endif printk(KERN_INFO "%s: power management initialized\n", PMC_DEVNAME); diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c index be8e862..62eede1 100644 --- a/arch/sparc/kernel/process_32.c +++ b/arch/sparc/kernel/process_32.c @@ -43,8 +43,7 @@ * Power management idle function * Set in pm platform drivers (apc.c and pmc.c) */ -void (*pm_idle)(void); -EXPORT_SYMBOL(pm_idle); +void (*sparc_idle)(void); /* * Power-off handler instantiation for pm.h compliance @@ -75,8 +74,8 @@ void cpu_idle(void) /* endless idle loop with no priority at all */ for (;;) { while (!need_resched()) { - if (pm_idle) -(*pm_idle)(); + if (sparc_idle) +(*sparc_idle)(); else cpu_relax(); } -- 1.8.1.3.535.ga923c31
[PATCH 1/4] intel_idle: stop using driver_data for static flags
From: Len Brown The commit, 4202735e8ab6ecfb0381631a0d0b58fefe0bd4e2 (cpuidle: Split cpuidle_state structure and move per-cpu statistics fields) observed that the MWAIT flags for Cn on every processor to date were the same, and created get_driver_data() to supply them. Unfortunately, that assumption is false, going forward. So here we restore the MWAIT flags to the cpuidle_state table. However, instead restoring the old "driver_data" field, we put the flags into the existing "flags" field, where they probalby should have lived all along. This patch does not change any operation. This patch removes 1 of the 3 users of cpuidle_state_usage.driver_data. Perhaps some day we'll get rid of the other 2. Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 75 --- 1 file changed, 26 insertions(+), 49 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 2df9414..8331753 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -109,6 +109,16 @@ static struct cpuidle_state *cpuidle_state_table; #define CPUIDLE_FLAG_TLB_FLUSHED 0x1 /* + * MWAIT takes an 8-bit "hint" in EAX "suggesting" + * the C-state (top nibble) and sub-state (bottom nibble) + * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc. + * + * We store the hint at the top of our "flags" for each state. + */ +#define flags_2_MWAIT_EAX(flags) (((flags) >> 24) & 0xFF) +#define MWAIT_EAX_2_flags(eax) ((eax & 0xFF) << 24) + +/* * States are indexed by the cstate number, * which is also the index into the MWAIT hint array. * Thus C0 is a dummy. @@ -118,21 +128,21 @@ static struct cpuidle_state nehalem_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C1 */ .name = "C1-NHM", .desc = "MWAIT 0x00", - .flags = CPUIDLE_FLAG_TIME_VALID, + .flags = MWAIT_EAX_2_flags(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 3, .target_residency = 6, .enter = &intel_idle }, { /* MWAIT C2 */ .name = "C3-NHM", .desc = "MWAIT 0x10", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT_EAX_2_flags(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 20, .target_residency = 80, .enter = &intel_idle }, { /* MWAIT C3 */ .name = "C6-NHM", .desc = "MWAIT 0x20", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT_EAX_2_flags(0x20) |CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 200, .target_residency = 800, .enter = &intel_idle }, @@ -143,28 +153,28 @@ static struct cpuidle_state snb_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C1 */ .name = "C1-SNB", .desc = "MWAIT 0x00", - .flags = CPUIDLE_FLAG_TIME_VALID, + .flags = MWAIT_EAX_2_flags(0x00) | CPUIDLE_FLAG_TIME_VALID, .exit_latency = 1, .target_residency = 1, .enter = &intel_idle }, { /* MWAIT C2 */ .name = "C3-SNB", .desc = "MWAIT 0x10", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT_EAX_2_flags(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 80, .target_residency = 211, .enter = &intel_idle }, { /* MWAIT C3 */ .name = "C6-SNB", .desc = "MWAIT 0x20", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT_EAX_2_flags(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 104, .target_residency = 345, .enter = &intel_idle }, { /* MWAIT C4 */ .name = "C7-SNB", .desc = "MWAIT 0x30", - .flags = CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT_EAX_2_flags(0x30) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 109, .target_residency = 345, .enter = &intel_idle }, @@ -175,28 +185,28 @@ static struct cpuidle_state ivb_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C1 */ .name = "C1-IVB", .desc = "MWAIT 0x00", - .flags = CPUIDLE_FLAG_TIME_VALID, +
[PATCH 4/4] intel_idle: initial HSW support
From: Len Brown This patch enables intel_idle to run on the next-generation Intel(R) Microarchitecture code named Haswell. Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 8331753..85d50a1 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -212,6 +212,38 @@ static struct cpuidle_state ivb_cstates[MWAIT_MAX_NUM_CSTATES] = { .enter = &intel_idle }, }; +static struct cpuidle_state hsw_cstates[MWAIT_MAX_NUM_CSTATES] = { + { /* MWAIT C0 */ }, + { /* MWAIT C1 */ + .name = "C1-HSW", + .desc = "MWAIT 0x00", + .flags = MWAIT_EAX_2_flags(0x00) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 2, + .target_residency = 2, + .enter = &intel_idle }, + { /* MWAIT C2 */ + .name = "C3-HSW", + .desc = "MWAIT 0x10", + .flags = MWAIT_EAX_2_flags(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 39, + .target_residency = 117, + .enter = &intel_idle }, + { /* MWAIT C3 */ + .name = "C6-HSW", + .desc = "MWAIT 0x20", + .flags = MWAIT_EAX_2_flags(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 55, + .target_residency = 165, + .enter = &intel_idle }, + { /* MWAIT C4 */ + .name = "C7-HSW", + .desc = "MWAIT 0x30", + .flags = MWAIT_EAX_2_flags(0x32) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 79, + .target_residency = 237, + .enter = &intel_idle }, +}; + static struct cpuidle_state atom_cstates[MWAIT_MAX_NUM_CSTATES] = { { /* MWAIT C0 */ }, { /* MWAIT C1 */ @@ -365,6 +397,10 @@ static const struct idle_cpu idle_cpu_ivb = { .state_table = ivb_cstates, }; +static const struct idle_cpu idle_cpu_hsw = { + .state_table = hsw_cstates, +}; + #define ICPU(model, cpu) \ { X86_VENDOR_INTEL, 6, model, X86_FEATURE_MWAIT, (unsigned long)&cpu } @@ -382,6 +418,9 @@ static const struct x86_cpu_id intel_idle_ids[] = { ICPU(0x2d, idle_cpu_snb), ICPU(0x3a, idle_cpu_ivb), ICPU(0x3e, idle_cpu_ivb), + ICPU(0x3c, idle_cpu_hsw), + ICPU(0x3f, idle_cpu_hsw), + ICPU(0x45, idle_cpu_hsw), {} }; MODULE_DEVICE_TABLE(x86cpu, intel_idle_ids); -- 1.8.1.2.422.g08c0e7f -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
intel_idle and turbostat patches supporting Haswell
Here are some pathces I have queued in my tree for the next release to support Haswell. Please let me know if you see issues with any of them. thanks! -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/4] tools/power turbostat: decode MSR_IA32_POWER_CTL
From: Len Brown When verbose is enabled, print the C1E-Enable bit in MSR_IA32_POWER_CTL. also delete some redundant tests on the verbose variable. Signed-off-by: Len Brown --- arch/x86/include/uapi/asm/msr-index.h | 2 ++ tools/power/x86/turbostat/turbostat.c | 13 +++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h index 433a59f..7bdaf7c 100644 --- a/arch/x86/include/uapi/asm/msr-index.h +++ b/arch/x86/include/uapi/asm/msr-index.h @@ -103,6 +103,8 @@ #define DEBUGCTLMSR_BTS_OFF_USR(1UL << 10) #define DEBUGCTLMSR_FREEZE_LBRS_ON_PMI (1UL << 11) +#define MSR_IA32_POWER_CTL 0x01fc + #define MSR_IA32_MC0_CTL 0x0400 #define MSR_IA32_MC0_STATUS0x0401 #define MSR_IA32_MC0_ADDR 0x0402 diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index b326878..75f64e0 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -908,8 +908,7 @@ void print_verbose_header(void) get_msr(0, MSR_NHM_PLATFORM_INFO, &msr); - if (verbose) - fprintf(stderr, "cpu0: MSR_NHM_PLATFORM_INFO: 0x%08llx\n", msr); + fprintf(stderr, "cpu0: MSR_NHM_PLATFORM_INFO: 0x%08llx\n", msr); ratio = (msr >> 40) & 0xFF; fprintf(stderr, "%d * %.0f = %.0f MHz max efficiency\n", @@ -919,13 +918,16 @@ void print_verbose_header(void) fprintf(stderr, "%d * %.0f = %.0f MHz TSC frequency\n", ratio, bclk, ratio * bclk); + get_msr(0, MSR_IA32_POWER_CTL, &msr); + fprintf(stderr, "cpu0: MSR_IA32_POWER_CTL: 0x%08llx (C1E: %sabled)\n", + msr, msr & 0x2 ? "EN" : "DIS"); + if (!do_ivt_turbo_ratio_limit) goto print_nhm_turbo_ratio_limits; get_msr(0, MSR_IVT_TURBO_RATIO_LIMIT, &msr); - if (verbose) - fprintf(stderr, "cpu0: MSR_IVT_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); + fprintf(stderr, "cpu0: MSR_IVT_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); ratio = (msr >> 56) & 0xFF; if (ratio) @@ -1016,8 +1018,7 @@ print_nhm_turbo_ratio_limits: get_msr(0, MSR_NHM_TURBO_RATIO_LIMIT, &msr); - if (verbose) - fprintf(stderr, "cpu0: MSR_NHM_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); + fprintf(stderr, "cpu0: MSR_NHM_TURBO_RATIO_LIMIT: 0x%08llx\n", msr); ratio = (msr >> 56) & 0xFF; if (ratio) -- 1.8.1.2.422.g08c0e7f -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/4] tools/power turbostat: support HSW
From: Len Brown This patch enables turbostat to run properly on the next-generation Intel(R) Microarchitecture, code named Haswell (HSW). HSW supports the BCLK and counters found in SNB. Signed-off-by: Len Brown --- tools/power/x86/turbostat/turbostat.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index ce6d460..b326878 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -1397,6 +1397,9 @@ int has_nehalem_turbo_ratio_limit(unsigned int family, unsigned int model) case 0x2D: /* SNB Xeon */ case 0x3A: /* IVB */ case 0x3E: /* IVB Xeon */ + case 0x3C: /* HSW */ + case 0x3F: /* HSW */ + case 0x45: /* HSW */ return 1; case 0x2E: /* Nehalem-EX Xeon - Beckton */ case 0x2F: /* Westmere-EX Xeon - Eagleton */ @@ -1488,6 +1491,9 @@ void rapl_probe(unsigned int family, unsigned int model) switch (model) { case 0x2A: case 0x3A: + case 0x3C: /* HSW */ + case 0x3F: /* HSW */ + case 0x45: /* HSW */ do_rapl = RAPL_PKG | RAPL_CORES | RAPL_GFX; break; case 0x2D: @@ -1724,6 +1730,9 @@ int is_snb(unsigned int family, unsigned int model) case 0x2D: case 0x3A: /* IVB */ case 0x3E: /* IVB Xeon */ + case 0x3C: /* HSW */ + case 0x3F: /* HSW */ + case 0x45: /* HSW */ return 1; } return 0; @@ -2248,7 +2257,7 @@ int main(int argc, char **argv) cmdline(argc, argv); if (verbose) - fprintf(stderr, "turbostat v3.0 November 23, 2012" + fprintf(stderr, "turbostat v3.1 January 8, 2013" " - Len Brown \n"); turbostat_init(); -- 1.8.1.2.422.g08c0e7f -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/4] intel_idle: stop using driver_data for static flags
On 02/01/2013 03:44 AM, Daniel Lezcano wrote: > On 02/01/2013 05:11 AM, Len Brown wrote: >> From: Len Brown >> >> The commit, 4202735e8ab6ecfb0381631a0d0b58fefe0bd4e2 >> (cpuidle: Split cpuidle_state structure and move per-cpu statistics fields) >> observed that the MWAIT flags for Cn on every processor to date were the >> same, and created get_driver_data() to supply them. >> >> Unfortunately, that assumption is false, going forward. >> So here we restore the MWAIT flags to the cpuidle_state table. >> However, instead restoring the old "driver_data" field, >> we put the flags into the existing "flags" field, >> where they probalby should have lived all along. > > Removing the driver_data is a good thing but I am not sure it is the > case by moving the MWAIT flags to the cpuidle_state's flags field. We > are mixing arch specific flags with a generic code. > > This is prone to errors because new flags could appear for the cpuidle > core code and could collide with the arch specific flags. > > Wouldn't make sense to add a private field in the struct cpuidle_state > structure to let the driver/arch specific to store whatever is needed ? > > struct cpuidle_state { > > ... > unsigned long private; > ... > > } The top half of the flags are reserved for the driver, as noted by the definition of CPUIDLE_DRIVER_FLAGS_MASK with the generic flag definitions: #define CPUIDLE_FLAG_TIME_VALID (0x01) /* is residency time measurable? */ #define CPUIDLE_FLAG_COUPLED(0x02) /* state applies to multiple cpus */ #define CPUIDLE_DRIVER_FLAGS_MASK (0x) intel_idle already uses a driver-specific flag: #define CPUIDLE_FLAG_TLB_FLUSHED0x1 This patch just uses 4 more bits along with that one. Sure, if we run out of space, we can add an additional field. But I don't see an immediate need for it. thanks, Len Brown Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/4] intel_idle: stop using driver_data for static flags
>> intel_idle already uses a driver-specific flag: >> >> #define CPUIDLE_FLAG_TLB_FLUSHED0x1 >> >> This patch just uses 4 more bits along with that one. > > Ok. In this case it would make sense to move this flag out of the > generic core code to the intel_idle.c file ? This flag is already local to intel_idle.c. If another architecture finds it useful, then yes, it would make sense to move it to the shared header. > and move the [dec/en]coding > macro flags_2_MWAIT_EAX and MWAIT_EAX_2_flags (with a more appropriate > name for a generic code) to the cpuidle.h file ? I think that a driver's private flag definitions should remain local to the driver. It makes no sense to pollute the name space of other drivers for stuff that doesn't mean anything to them. MWAIT is pretty specific to x86 -- and re-naming it to something 'generic' isn't going to make the code easier to read. thanks, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] APM idle: register apm_cpu_idle via cpuidle
On 02/18/2013 08:30 AM, Jiri Kosina wrote: > On Mon, 11 Feb 2013, Len Brown wrote: > >> From: Len Brown >> >> Update APM to register its local idle routine with cpuidle. >> >> This allows us to stop exporting pm_idle to modules on x86. >> >> The Kconfig sub-option, APM_CPU_IDLE, now depends on on CPU_IDLE. >> >> Compile-tested only. > > I will test it on APM hardware and report back to you. > > Do you then want me to take it through my tree, or are you going to push > the whole patchset as a whole? Rafael took my series as a whole into his PM tree and it is currently staged for a 3.9 pull. But if I broke your APM box, I'm standing by ready to fix it during the release. Please get back to me with the test results: dmesg | grep idle grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* would tell us if it is working or not. thanks! -Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 05/16] blackfin idle: delete pm_idle
> Otherwise the patch seems to work fine. Thanks for testing Lars! Sorry about breaking the build w/ my slopyness -- will get that patched up. cheers, Len Brown, Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 01/37] Hibernation: Introduce SNAPSHOT_GET_IMAGE_SIZE ioctl
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Add a new ioctl, SNAPSHOT_GET_IMAGE_SIZE, returning the size of the (just created) hibernation image, to the hibernation userland interface. This ioctl is necessary so that the userland utilities using the interface need not access the hibernation image header, owned by the kernel, in order to obtain the size of the image. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/userland-swsusp.txt | 12 +--- kernel/power/power.h|4 +++- kernel/power/snapshot.c |7 ++- kernel/power/user.c | 18 ++ 4 files changed, 28 insertions(+), 13 deletions(-) diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index e00c6cf..32f1874 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt @@ -54,6 +54,8 @@ SNAPSHOT_SET_IMAGE_SIZE - set the preferred maximum size of the image this number, but if it turns out to be impossible, the kernel will create the smallest image possible) +SNAPSHOT_GET_IMAGE_SIZE - return the actual size of the hibernation image + SNAPSHOT_AVAIL_SWAP - return the amount of available swap in bytes (the last argument should be a pointer to an unsigned int variable that will contain the result if the call is successful). @@ -136,13 +138,9 @@ required, as they can use, for example, a special (blank) suspend partition or a file on a partition that is unmounted before SNAPSHOT_ATOMIC_SNAPSHOT and mounted afterwards. -These utilities SHOULD NOT make any assumptions regarding the ordering of -data within the snapshot image, except for the image header that MAY be -assumed to start with an swsusp_info structure, as specified in -kernel/power/power.h. This structure MAY be used by the userland utilities -to obtain some information about the snapshot image, such as the size -of the snapshot image, including the metadata and the header itself, -contained in the .size member of swsusp_info. +These utilities MUST NOT make any assumptions regarding the ordering of +data within the snapshot image. The contents of the image are entirely owned +by the kernel and its structure may be changed in future kernel releases. The snapshot image MUST be written to the kernel unaltered (ie. all of the image data, metadata and header MUST be written in _exactly_ the same amount, form diff --git a/kernel/power/power.h b/kernel/power/power.h index 2093c3a..23c1703 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -128,6 +128,7 @@ struct snapshot_handle { #define data_of(handle)((handle).buffer + (handle).buf_offset) extern unsigned int snapshot_additional_pages(struct zone *zone); +extern unsigned long snapshot_get_image_size(void); extern int snapshot_read_next(struct snapshot_handle *handle, size_t count); extern int snapshot_write_next(struct snapshot_handle *handle, size_t count); extern void snapshot_write_finalize(struct snapshot_handle *handle); @@ -158,7 +159,8 @@ struct resume_swap_area { #define SNAPSHOT_PMOPS _IOW(SNAPSHOT_IOC_MAGIC, 12, unsigned int) #define SNAPSHOT_SET_SWAP_AREA _IOW(SNAPSHOT_IOC_MAGIC, 13, \ struct resume_swap_area) -#define SNAPSHOT_IOC_MAXNR 13 +#define SNAPSHOT_GET_IMAGE_SIZE_IOR(SNAPSHOT_IOC_MAGIC, 14, loff_t) +#define SNAPSHOT_IOC_MAXNR 14 #define PMOPS_PREPARE 1 #define PMOPS_ENTER2 diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index 78039b4..c5ce0f3 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -1264,12 +1264,17 @@ static char *check_image_kernel(struct swsusp_info *info) } #endif /* CONFIG_ARCH_HIBERNATION_HEADER */ +unsigned long snapshot_get_image_size(void) +{ + return nr_copy_pages + nr_meta_pages + 1; +} + static int init_header(struct swsusp_info *info) { memset(info, 0, sizeof(struct swsusp_info)); info->num_physpages = num_physpages; info->image_pages = nr_copy_pages; - info->pages = nr_copy_pages + nr_meta_pages + 1; + info->pages = snapshot_get_image_size(); info->size = info->pages; info->size <<= PAGE_SHIFT; return init_header_complete(info); diff --git a/kernel/power/user.c b/kernel/power/user.c index 5bd321b..88aac26 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -133,7 +133,7 @@ static int snapshot_ioctl(struct inode *inode, struct file *filp, { int error = 0; struct snapshot_data *data; - loff_t avail; + loff_t size; sector_t offset; if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC) @@ -210,10 +210,20 @@ static int sna
suspend patches for 2.6.25-rc0
This is the suspend queue for 2.6.25 -- freshly re-based to the HEAD of Linus' tree. I plan to send a pull request tomorrow. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 02/37] Hibernation: Rework platform support ioctls (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Modify the hibernation userland interface by adding two new ioctls to it, SNAPSHOT_PLATFORM_SUPPORT and SNAPSHOT_POWER_OFF, that can be used, respectively, to switch the hibernation platform support on/off and to make the kernel transition the system to the hibernation state (eg. ACPI S4) using the platform (eg. ACPI) driver. These ioctls are intended to replace the misdesigned SNAPSHOT_PMOPS ioctl, which from now is regarded as obsolete and will be removed in the future. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/userland-swsusp.txt | 24 -- kernel/power/power.h|9 ++ kernel/power/user.c | 39 +++ 3 files changed, 38 insertions(+), 34 deletions(-) diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index 32f1874..381e9c0 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt @@ -85,6 +85,12 @@ SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in recommended to always use this call, because the code to set the resume partition may be removed from future kernels +SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support, + depending on the argument value (enable, if the argument is nonzero) + +SNAPSHOT_POWER_OFF - make the kernel transition the system to the hibernation + state (eg. ACPI S4) using the platform (eg. ACPI) driver + SNAPSHOT_S2RAM - suspend to RAM; using this call causes the kernel to immediately enter the suspend-to-RAM state, so this call must always be preceded by the SNAPSHOT_FREEZE call and it is also necessary @@ -95,24 +101,6 @@ SNAPSHOT_S2RAM - suspend to RAM; using this call causes the kernel to to resume the system from RAM if there's enough battery power or restore its state on the basis of the saved suspend image otherwise) -SNAPSHOT_PMOPS - enable the usage of the hibernation_ops->prepare, - hibernate_ops->enter and hibernation_ops->finish methods (the in-kernel - swsusp knows these as the "platform method") which are needed on many - machines to (among others) speed up the resume by letting the BIOS skip - some steps or to let the system recognise the correct state of the - hardware after the resume (in particular on many machines this ensures - that unplugged AC adapters get correctly detected and that kacpid does - not run wild after the resume). The last ioctl() argument can take one - of the three values, defined in kernel/power/power.h: - PMOPS_PREPARE - make the kernel carry out the - hibernation_ops->prepare() operation - PMOPS_ENTER - make the kernel power off the system by calling - hibernation_ops->enter() - PMOPS_FINISH - make the kernel carry out the - hibernation_ops->finish() operation - Note that the actual constants are misnamed because they surface - internal kernel implementation details that have changed. - The device's read() operation can be used to transfer the snapshot image from the kernel. It has the following limitations: - you cannot read() more than one virtual memory page at a time diff --git a/kernel/power/power.h b/kernel/power/power.h index 23c1703..6ca85fd 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -156,15 +156,12 @@ struct resume_swap_area { #define SNAPSHOT_FREE_SWAP_PAGES _IO(SNAPSHOT_IOC_MAGIC, 9) #define SNAPSHOT_SET_SWAP_FILE _IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int) #define SNAPSHOT_S2RAM _IO(SNAPSHOT_IOC_MAGIC, 11) -#define SNAPSHOT_PMOPS _IOW(SNAPSHOT_IOC_MAGIC, 12, unsigned int) #define SNAPSHOT_SET_SWAP_AREA _IOW(SNAPSHOT_IOC_MAGIC, 13, \ struct resume_swap_area) #define SNAPSHOT_GET_IMAGE_SIZE_IOR(SNAPSHOT_IOC_MAGIC, 14, loff_t) -#define SNAPSHOT_IOC_MAXNR 14 - -#define PMOPS_PREPARE 1 -#define PMOPS_ENTER2 -#define PMOPS_FINISH 3 +#define SNAPSHOT_PLATFORM_SUPPORT _IO(SNAPSHOT_IOC_MAGIC, 15) +#define SNAPSHOT_POWER_OFF _IO(SNAPSHOT_IOC_MAGIC, 16) +#define SNAPSHOT_IOC_MAXNR 16 /* If unset, the snapshot device cannot be open. */ extern atomic_t snapshot_device_available; diff --git a/kernel/power/user.c b/kernel/power/user.c index 88aac26..de3fb43 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -28,6 +28,18 @@ #include "power.h" +/* + * NOTE: The SNAPSHOT_PMOPS ioctl is obsolete and will be removed in the + * future. It is only preserved here for compatibility with ex
[PATCH 11/37] Hibernation: New testing facility (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Make it possible to test the hibernation core code with the help of the /sys/power/pm_test attribute introduced for suspend testing in the previous patch. Writing an appropriate string to this file causes the hibernation code to work in one of the test modes defined as follows: freezer - test the freezing of processes devices - test the freezing of processes and suspending of devices platform - test the freezing of processes, suspending of devices and platform global  control methods(*) processors - test the freezing of processes, suspending of devices, platform global  control methods(*) and the disabling of nonboot CPUs core - test the freezing of processes, suspending of devices, platform global  control methods(*), the disabling of nonboot CPUs and suspending of  platform/system devices (*) - the platform global control methods are only available on ACPI systems    and are only tested if the hibernation mode is set to "platform" Then, if a hibernation is started by normal means, the hibernation core will perform its normal operations up to the point indicated by given test level. Next, it will wait for 5 seconds and carry out the resume operations needed to transition the system back to the fully functional state. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/disk.c | 70 +++-- kernel/power/power.h |2 + 2 files changed, 57 insertions(+), 15 deletions(-) diff --git a/kernel/power/disk.c b/kernel/power/disk.c index 6597365..0866b16 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -70,6 +70,35 @@ void hibernation_set_ops(struct platform_hibernation_ops *ops) mutex_unlock(&pm_mutex); } +#ifdef CONFIG_PM_DEBUG +static void hibernation_debug_sleep(void) +{ + printk(KERN_INFO "hibernation debug: Waiting for 5 seconds.\n"); + mdelay(5000); +} + +static int hibernation_testmode(int mode) +{ + if (hibernation_mode == mode) { + hibernation_debug_sleep(); + return 1; + } + return 0; +} + +static int hibernation_test(int level) +{ + if (pm_test_level == level) { + hibernation_debug_sleep(); + return 1; + } + return 0; +} +#else /* !CONFIG_PM_DEBUG */ +static int hibernation_testmode(int mode) { return 0; } +static int hibernation_test(int level) { return 0; } +#endif /* !CONFIG_PM_DEBUG */ + /** * platform_start - tell the platform driver that we're starting * hibernation @@ -167,6 +196,10 @@ int create_image(int platform_mode) goto Enable_irqs; } + if (hibernation_test(TEST_CORE)) + goto Power_up; + + in_suspend = 1; save_processor_state(); error = swsusp_arch_suspend(); if (error) @@ -175,6 +208,7 @@ int create_image(int platform_mode) restore_processor_state(); if (!in_suspend) platform_leave(platform_mode); + Power_up: /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ @@ -211,24 +245,29 @@ int hibernation_snapshot(int platform_mode) if (error) goto Resume_console; - error = platform_pre_snapshot(platform_mode); - if (error) + if (hibernation_test(TEST_DEVICES)) goto Resume_devices; + error = platform_pre_snapshot(platform_mode); + if (error || hibernation_test(TEST_PLATFORM)) + goto Finish; + error = disable_nonboot_cpus(); if (!error) { - if (hibernation_mode != HIBERNATION_TEST) { - in_suspend = 1; - error = create_image(platform_mode); - /* Control returns here after successful restore */ - } else { - printk("swsusp debug: Waiting for 5 seconds.\n"); - mdelay(5000); - } + if (hibernation_test(TEST_CPUS)) + goto Enable_cpus; + + if (hibernation_testmode(HIBERNATION_TEST)) + goto Enable_cpus; + + error = create_image(platform_mode); + /* Control returns here after successful restore */ } + Enable_cpus: enable_nonboot_cpus(); - Resume_devices: + Finish: platform_finish(platform_mode); + Resume_devices: device_resume(); Resume_console: resume_console(); @@ -406,11 +445,12 @@ int hibernate(void) if (error) goto Finish; - if (hibernation_mode == HIBERNATION_TESTPROC) { - printk("swsusp debug: Waiting for 5 seconds.\n")
[PATCH 13/37] PM: Make PM_TRACE more architecture independent
From: Johannes Berg <[EMAIL PROTECTED]> When trying to debug a suspend failure I started implementing PM_TRACE for powerpc. I then noticed that I'm debugging a suspend failure and so PM_TRACE isn't useful at all, but thought that nonetheless this could be useful in the future. Basically, to support PM_TRACE, you add a Kconfig option that selects PM_TRACE and provides the infrastructure as per the help text of PM_TRACE. Signed-off-by: Johannes Berg <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/base/power/Makefile |2 +- kernel/power/Kconfig| 23 ++- 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/Makefile b/drivers/base/power/Makefile index de28dfd..911208b 100644 --- a/drivers/base/power/Makefile +++ b/drivers/base/power/Makefile @@ -1,6 +1,6 @@ obj-$(CONFIG_PM) += sysfs.o obj-$(CONFIG_PM_SLEEP) += main.o -obj-$(CONFIG_PM_TRACE) += trace.o +obj-$(CONFIG_PM_TRACE_RTC) += trace.o ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG ccflags-$(CONFIG_PM_VERBOSE) += -DDEBUG diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig index 8e186c6..06a08f7 100644 --- a/kernel/power/Kconfig +++ b/kernel/power/Kconfig @@ -44,9 +44,30 @@ config PM_VERBOSE ---help--- This option enables verbose messages from the Power Management code. +config CAN_PM_TRACE + def_bool y + depends on PM_DEBUG && PM_SLEEP && EXPERIMENTAL + config PM_TRACE + bool + help + This enables code to save the last PM event point across + reboot. The architecture needs to support this, x86 for + example does by saving things in the RTC, see below. + + The architecture specific code must provide the extern + functions from as well as the + header with a TRACE_RESUME() macro. + + The way the information is presented is architecture- + dependent, x86 will print the information during a + late_initcall. + +config PM_TRACE_RTC bool "Suspend/resume event tracing" - depends on PM_DEBUG && X86 && PM_SLEEP && EXPERIMENTAL + depends on CAN_PM_TRACE + depends on X86 + select PM_TRACE default n ---help--- This enables some cheesy code to save the last PM event point in the -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 03/37] Hibernation: Mark SNAPSHOT_SET_SWAP_FILE ioctl as deprecated (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Mark the SNAPSHOT_SET_SWAP_FILE ioctl belonging to the hibernation userland interface as deprecated. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/userland-swsusp.txt | 14 ++ kernel/power/power.h|1 - kernel/power/user.c |9 + 3 files changed, 7 insertions(+), 17 deletions(-) diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index 381e9c0..0785500 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt @@ -67,23 +67,13 @@ SNAPSHOT_GET_SWAP_PAGE - allocate a swap page from the resume partition SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated with SNAPSHOT_GET_SWAP_PAGE -SNAPSHOT_SET_SWAP_FILE - set the resume partition (the last ioctl() argument - should specify the device's major and minor numbers in the old - two-byte format, as returned by the stat() function in the .st_rdev - member of the stat structure) - SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in units) from the beginning of the partition at which the swap header is located (the last ioctl() argument should point to a struct resume_swap_area, as defined in kernel/power/power.h, containing the - resume device specification, as for the SNAPSHOT_SET_SWAP_FILE ioctl(), - and the offset); for swap partitions the offset is always 0, but it is - different to zero for swap files (please see + resume device specification and the offset); for swap partitions the + offset is always 0, but it is different from zero for swap files (see Documentation/swsusp-and-swap-files.txt for details). - The SNAPSHOT_SET_SWAP_AREA ioctl() is considered as a replacement for - SNAPSHOT_SET_SWAP_FILE which is regarded as obsolete. It is - recommended to always use this call, because the code to set the resume - partition may be removed from future kernels SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support, depending on the argument value (enable, if the argument is nonzero) diff --git a/kernel/power/power.h b/kernel/power/power.h index 6ca85fd..8837ea3 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -154,7 +154,6 @@ struct resume_swap_area { #define SNAPSHOT_AVAIL_SWAP_IOR(SNAPSHOT_IOC_MAGIC, 7, void *) #define SNAPSHOT_GET_SWAP_PAGE _IOR(SNAPSHOT_IOC_MAGIC, 8, void *) #define SNAPSHOT_FREE_SWAP_PAGES _IO(SNAPSHOT_IOC_MAGIC, 9) -#define SNAPSHOT_SET_SWAP_FILE _IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int) #define SNAPSHOT_S2RAM _IO(SNAPSHOT_IOC_MAGIC, 11) #define SNAPSHOT_SET_SWAP_AREA _IOW(SNAPSHOT_IOC_MAGIC, 13, \ struct resume_swap_area) diff --git a/kernel/power/user.c b/kernel/power/user.c index de3fb43..5e866e0 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -29,10 +29,11 @@ #include "power.h" /* - * NOTE: The SNAPSHOT_PMOPS ioctl is obsolete and will be removed in the - * future. It is only preserved here for compatibility with existing userland - * utilities. + * NOTE: The SNAPSHOT_SET_SWAP_FILE and SNAPSHOT_PMOPS ioctls are obsolete and + * will be removed in the future. They are only preserved here for + * compatibility with existing userland utilities. */ +#define SNAPSHOT_SET_SWAP_FILE _IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int) #define SNAPSHOT_PMOPS _IOW(SNAPSHOT_IOC_MAGIC, 12, unsigned int) #define PMOPS_PREPARE 1 @@ -260,7 +261,7 @@ static int snapshot_ioctl(struct inode *inode, struct file *filp, free_all_swap_pages(data->swap); break; - case SNAPSHOT_SET_SWAP_FILE: + case SNAPSHOT_SET_SWAP_FILE: /* This ioctl is deprecated */ if (!swsusp_swap_in_use()) { /* * User space encodes device types as two-byte values, -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 06/37] ACPI: Fix mismerge in acpi_hibernation_finish
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Some code in acpi_hibernation_finish() was moved to acpi_hibernation_leave(), but the old copy had been left (it's harmless, but also useless). Remove it. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/sleep/main.c |5 - 1 files changed, 0 insertions(+), 5 deletions(-) diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index 2c0b663..cbfa058 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -267,11 +267,6 @@ static void acpi_hibernation_leave(void) static void acpi_hibernation_finish(void) { - /* -* If ACPI is not enabled by the BIOS and the boot kernel, we need to -* enable it here. -*/ - acpi_enable(); acpi_disable_wakeup_device(ACPI_STATE_S4); acpi_leave_sleep_state(ACPI_STATE_S4); -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 07/37] Hibernation: Move function prototypes to header
From: Adrian Bunk <[EMAIL PROTECTED]> This patch moves the prototypes of count_highmem_pages() and restore_highmem() to kernel/power/power.h Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/power.h|8 kernel/power/snapshot.c |1 - kernel/power/swsusp.c |8 3 files changed, 8 insertions(+), 9 deletions(-) diff --git a/kernel/power/power.h b/kernel/power/power.h index ef90605..c5321eb 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -180,3 +180,11 @@ static inline int pm_notifier_call_chain(unsigned long val) return (blocking_notifier_call_chain(&pm_chain_head, val, NULL) == NOTIFY_BAD) ? -EINVAL : 0; } + +#ifdef CONFIG_HIGHMEM +unsigned int count_highmem_pages(void); +int restore_highmem(void); +#else +static inline unsigned int count_highmem_pages(void) { return 0; } +static inline int restore_highmem(void) { return 0; } +#endif diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index c5ce0f3..1ec3ecc 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -872,7 +872,6 @@ unsigned int count_highmem_pages(void) } #else static inline void *saveable_highmem_page(unsigned long pfn) { return NULL; } -static inline unsigned int count_highmem_pages(void) { return 0; } #endif /* CONFIG_HIGHMEM */ /** diff --git a/kernel/power/swsusp.c b/kernel/power/swsusp.c index e1722d3..605c536 100644 --- a/kernel/power/swsusp.c +++ b/kernel/power/swsusp.c @@ -64,14 +64,6 @@ unsigned long image_size = 500 * 1024 * 1024; int in_suspend __nosavedata = 0; -#ifdef CONFIG_HIGHMEM -unsigned int count_highmem_pages(void); -int restore_highmem(void); -#else -static inline int restore_highmem(void) { return 0; } -static inline unsigned int count_highmem_pages(void) { return 0; } -#endif - /** * The following functions are used for tracing the allocated * swap pages, so that they can be freed in case of an error. -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 23/37] Suspend: Clean up Kconfig (V2)
From: Johannes Berg <[EMAIL PROTECTED]> This cleans up the suspend Kconfig and removes the need to declare centrally which architectures support suspend. All architectures that currently support suspend are modified accordingly. Signed-off-by: Johannes Berg <[EMAIL PROTECTED]> Acked-by: Russell King <[EMAIL PROTECTED]> Acked-by: Paul Mackerras <[EMAIL PROTECTED]> Acked-by: Ralf Baechle <[EMAIL PROTECTED]> Acked-by: Paul Mundt <[EMAIL PROTECTED]> Cc: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- arch/arm/Kconfig |3 +++ arch/blackfin/Kconfig |4 arch/frv/Kconfig |5 + arch/mips/Kconfig |4 arch/powerpc/Kconfig |4 arch/sh/Kconfig |4 arch/x86/Kconfig |4 kernel/power/Kconfig | 21 +++-- 8 files changed, 31 insertions(+), 18 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 77201d3..a00d8b9 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1035,6 +1035,9 @@ menu "Power management options" source "kernel/power/Kconfig" +config ARCH_SUSPEND_POSSIBLE + def_bool y + endmenu source "net/Kconfig" diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig index fc7ca86..4802eb7 100644 --- a/arch/blackfin/Kconfig +++ b/arch/blackfin/Kconfig @@ -898,6 +898,10 @@ endmenu menu "Power management options" source "kernel/power/Kconfig" +config ARCH_SUSPEND_POSSIBLE + def_bool y + depends on !SMP + choice prompt "Select PM Wakeup Event Source" default PM_WAKEUP_GPIO_BY_SIC_IWR diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig index 43153e7..2e25b95 100644 --- a/arch/frv/Kconfig +++ b/arch/frv/Kconfig @@ -357,6 +357,11 @@ source "drivers/pcmcia/Kconfig" #should probably wait a while. menu "Power management options" + +config ARCH_SUSPEND_POSSIBLE + def_bool y + depends on !SMP + source kernel/power/Kconfig endmenu diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 4fad0a3..e387f3a 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -2086,6 +2086,10 @@ endmenu menu "Power management options" +config ARCH_SUSPEND_POSSIBLE + def_bool y + depends on !SMP + source "kernel/power/Kconfig" endmenu diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 68f0cf7..824140d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -166,6 +166,10 @@ config ARCH_HIBERNATION_POSSIBLE depends on (PPC64 && HIBERNATE_64) || (PPC32 && HIBERNATE_32) default y +config ARCH_SUSPEND_POSSIBLE + def_bool y + depends on ADB_PMU || PPC_EFIKA || PPC_LITE5200 + config PPC_DCR_NATIVE bool default n diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index 1cd9c8f..b30c4c3 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -882,6 +882,10 @@ endmenu menu "Power management options (EXPERIMENTAL)" depends on EXPERIMENTAL && SYS_SUPPORTS_PM +config ARCH_SUSPEND_POSSIBLE + def_bool y + depends on !SMP + source kernel/power/Kconfig endmenu diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 9c511dd..340e81a 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -113,6 +113,10 @@ config ARCH_HIBERNATION_POSSIBLE def_bool y depends on !SMP || !X86_VOYAGER +config ARCH_SUSPEND_POSSIBLE + def_bool y + depends on !X86_VOYAGER + config ZONE_DMA32 bool default X86_64 diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig index fd76d54..f8153fd 100644 --- a/kernel/power/Kconfig +++ b/kernel/power/Kconfig @@ -85,7 +85,7 @@ config PM_TRACE_RTC config PM_SLEEP_SMP bool depends on SMP - depends on SUSPEND_SMP_POSSIBLE || ARCH_HIBERNATION_POSSIBLE + depends on ARCH_SUSPEND_POSSIBLE || ARCH_HIBERNATION_POSSIBLE depends on PM_SLEEP select HOTPLUG_CPU default y @@ -95,29 +95,14 @@ config PM_SLEEP depends on SUSPEND || HIBERNATION default y -config SUSPEND_UP_POSSIBLE - bool - depends on (X86 && !X86_VOYAGER) || PPC || ARM || BLACKFIN || MIPS \ - || SUPERH || FRV - depends on !SMP - default y - -config SUSPEND_SMP_POSSIBLE - bool - depends on (X86 && !X86_VOYAGER) \ - || (PPC && (PPC_PSERIES || PPC_PMAC)) || ARM - depends on SMP - default y - config SUSPEND bool "Suspend to RAM and standby" - depends on PM - depends on SUSPEND_UP_POSSIBLE || SUSPEND_SMP_POSSIBLE + depends on PM && ARCH_SUSPEND_POSSIBLE default y ---help--- Allow the system to enter sleep states in which main memory is po
[PATCH 08/37] Hibernation: Add PM_RESTORE_PREPARE and PM_POST_RESTORE notifiers (rev. 2)
From: Alan Stern <[EMAIL PROTECTED]> Add PM_RESTORE_PREPARE and PM_POST_RESTORE notifiers to the PM core, to be used in analogy with the existing PM_HIBERNATION_PREPARE and PM_POST_HIBERNATION notifiers. Signed-off-by: Alan Stern <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: "Rafael J. Wysocki" <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/notifiers.txt |8 include/linux/notifier.h |2 ++ kernel/power/disk.c |5 + kernel/power/user.c | 31 +++ 4 files changed, 34 insertions(+), 12 deletions(-) diff --git a/Documentation/power/notifiers.txt b/Documentation/power/notifiers.txt index 9293e4b..ae1b7ec 100644 --- a/Documentation/power/notifiers.txt +++ b/Documentation/power/notifiers.txt @@ -28,6 +28,14 @@ PM_POST_HIBERNATION The system memory state has been restored from a hibernation. Device drivers' .resume() callbacks have been executed and tasks have been thawed. +PM_RESTORE_PREPARE The system is going to restore a hibernation image. + If all goes well the restored kernel will issue a + PM_POST_HIBERNATION notification. + +PM_POST_RESTOREAn error occurred during the hibernation restore. + Device drivers' .resume() callbacks have been executed + and tasks have been thawed. + PM_SUSPEND_PREPARE The system is preparing for a suspend. PM_POST_SUSPENDThe system has just resumed or an error occured during diff --git a/include/linux/notifier.h b/include/linux/notifier.h index 5dfbc68..f4df400 100644 --- a/include/linux/notifier.h +++ b/include/linux/notifier.h @@ -228,6 +228,8 @@ static inline int notifier_to_errno(int ret) #define PM_POST_HIBERNATION0x0002 /* Hibernation finished */ #define PM_SUSPEND_PREPARE 0x0003 /* Going to suspend the system */ #define PM_POST_SUSPEND0x0004 /* Suspend finished */ +#define PM_RESTORE_PREPARE 0x0005 /* Going to restore a saved image */ +#define PM_POST_RESTORE0x0006 /* Restore failed */ /* Console keyboard events. * Note: KBD_KEYCODE is always sent before KBD_UNBOUND_KEYCODE, KBD_UNICODE and diff --git a/kernel/power/disk.c b/kernel/power/disk.c index b138b43..6597365 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -499,6 +499,10 @@ static int software_resume(void) goto Unlock; } + error = pm_notifier_call_chain(PM_RESTORE_PREPARE); + if (error) + goto Finish; + error = create_basic_memory_bitmaps(); if (error) goto Finish; @@ -522,6 +526,7 @@ static int software_resume(void) Done: free_basic_memory_bitmaps(); Finish: + pm_notifier_call_chain(PM_POST_RESTORE); atomic_inc(&snapshot_device_available); /* For success case, the suspend path will release the lock */ Unlock: diff --git a/kernel/power/user.c b/kernel/power/user.c index b902a7e..f5512cb 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -67,6 +67,7 @@ atomic_t snapshot_device_available = ATOMIC_INIT(1); static int snapshot_open(struct inode *inode, struct file *filp) { struct snapshot_data *data; + int error; if (!atomic_add_unless(&snapshot_device_available, -1, 0)) return -EBUSY; @@ -87,9 +88,19 @@ static int snapshot_open(struct inode *inode, struct file *filp) data->swap = swsusp_resume_device ? swap_type_of(swsusp_resume_device, 0, NULL) : -1; data->mode = O_RDONLY; + error = pm_notifier_call_chain(PM_RESTORE_PREPARE); + if (error) + pm_notifier_call_chain(PM_POST_RESTORE); } else { data->swap = -1; data->mode = O_WRONLY; + error = pm_notifier_call_chain(PM_HIBERNATION_PREPARE); + if (error) + pm_notifier_call_chain(PM_POST_HIBERNATION); + } + if (error) { + atomic_inc(&snapshot_device_available); + return error; } data->frozen = 0; data->ready = 0; @@ -111,6 +122,8 @@ static int snapshot_release(struct inode *inode, struct file *filp) thaw_processes(); mutex_unlock(&pm_mutex); } + pm_notifier_call_chain(data->mode == O_WRONLY ? + PM_POST_HIBERNATION : PM_POST_RESTORE); atomic_inc(&snapshot_device_available); return 0; } @@ -174,18 +187,13 @@ static int snapshot_ioctl(struct inode *inode, struct file *filp, if (data->frozen) break;
[PATCH 09/37] Suspend: Testing facility (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Introduce sysfs attribute /sys/power/pm_test allowing one to test the suspend core code.  Namely, writing one of the strings: freezer devices platform processors core to this file causes the suspend code to work in one of the test modes defined as follows: freezer - test the freezing of processes devices - test the freezing of processes and suspending of devices platform - test the freezing of processes, suspending of devices and platform global  control methods(*) processors - test the freezing of processes, suspending of devices, platform global  control methods and the disabling of nonboot CPUs core - test the freezing of processes, suspending of devices, platform global  control methods, the disabling of nonboot CPUs and suspending of  platform/system devices (*) These are ACPI global control methods on ACPI systems Then, if a suspend is started by normal means, the suspend core will perform its normal operations up to the point indicated by given test level.  Next, it will wait for 5 seconds and carry out the resume operations needed to transition the system back to the fully functional state. Writing "none" to /sys/power/pm_test turns the testing off. When open for reading, /sys/power/pm_test contains a space-separated list of all available tests (including "none" that represents the normal functionality) in which the current test level is indicated by square brackets. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c | 108 + kernel/power/power.h | 18 2 files changed, 117 insertions(+), 9 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index efc0836..84e1ae6 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -31,6 +31,79 @@ DEFINE_MUTEX(pm_mutex); unsigned int pm_flags; EXPORT_SYMBOL(pm_flags); +#ifdef CONFIG_PM_DEBUG +int pm_test_level = TEST_NONE; + +static int suspend_test(int level) +{ + if (pm_test_level == level) { + printk(KERN_INFO "suspend debug: Waiting for 5 seconds.\n"); + mdelay(5000); + return 1; + } + return 0; +} + +static const char * const pm_tests[__TEST_AFTER_LAST] = { + [TEST_NONE] = "none", + [TEST_CORE] = "core", + [TEST_CPUS] = "processors", + [TEST_PLATFORM] = "platform", + [TEST_DEVICES] = "devices", + [TEST_FREEZER] = "freezer", +}; + +static ssize_t pm_test_show(struct kset *kset, char *buf) +{ + char *s = buf; + int level; + + for (level = TEST_FIRST; level <= TEST_MAX; level++) + if (pm_tests[level]) { + if (level == pm_test_level) + s += sprintf(s, "[%s] ", pm_tests[level]); + else + s += sprintf(s, "%s ", pm_tests[level]); + } + + if (s != buf) + /* convert the last space to a newline */ + *(s-1) = '\n'; + + return (s - buf); +} + +static ssize_t pm_test_store(struct kset *kset, const char *buf, size_t n) +{ + const char * const *s; + int level; + char *p; + int len; + int error = -EINVAL; + + p = memchr(buf, '\n', n); + len = p ? p - buf : n; + + mutex_lock(&pm_mutex); + + level = TEST_FIRST; + for (s = &pm_tests[level]; level <= TEST_MAX; s++, level++) + if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) { + pm_test_level = level; + error = 0; + break; + } + + mutex_unlock(&pm_mutex); + + return error ? error : n; +} + +power_attr(pm_test); +#else /* !CONFIG_PM_DEBUG */ +static inline int suspend_test(int level) { return 0; } +#endif /* !CONFIG_PM_DEBUG */ + #ifdef CONFIG_SUSPEND /* This is just an arbitrary number */ @@ -136,7 +209,10 @@ static int suspend_enter(suspend_state_t state) printk(KERN_ERR "Some devices failed to power down\n"); goto Done; } - error = suspend_ops->enter(state); + + if (!suspend_test(TEST_CORE)) + error = suspend_ops->enter(state); + device_power_up(); Done: arch_suspend_enable_irqs(); @@ -167,16 +243,25 @@ int suspend_devices_and_enter(suspend_state_t state) printk(KERN_ERR "Some devices failed to suspend\n"); goto Resume_console; } + + if (suspend_test(TEST_DEVICES)) + goto Resume_devices; + if (suspend_ops->prepare) { error =
[PATCH 10/37] suspend: build fix responding to 2.6.25 kset change
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index 84e1ae6..fc717b8 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -53,7 +53,8 @@ static const char * const pm_tests[__TEST_AFTER_LAST] = { [TEST_FREEZER] = "freezer", }; -static ssize_t pm_test_show(struct kset *kset, char *buf) +static ssize_t pm_test_show(struct kobject *kobj, struct kobj_attribute *attr, + char *buf) { char *s = buf; int level; @@ -73,7 +74,8 @@ static ssize_t pm_test_show(struct kset *kset, char *buf) return (s - buf); } -static ssize_t pm_test_store(struct kset *kset, const char *buf, size_t n) +static ssize_t pm_test_store(struct kobject *kobj, struct kobj_attribute *attr, + const char *buf, size_t n) { const char * const *s; int level; @@ -104,6 +106,7 @@ power_attr(pm_test); static inline int suspend_test(int level) { return 0; } #endif /* !CONFIG_PM_DEBUG */ + #ifdef CONFIG_SUSPEND /* This is just an arbitrary number */ -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 24/37] ACPI: clear GPE earily in resume to avoid warning
From: Shaohua Li <[EMAIL PROTECTED]> Wakeup GPE hasn't a handler. If system is waked up by such GPE like a USB hotplug, I saw a lot of error reporting the GPE hasn't handler. acpi_leave_sleep_state will clear the GPE but it's too late, we should do it before interrupt is re-enabled. Signed-off-by: Shaohua Li <[EMAIL PROTECTED]> Acked-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/sleep/main.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index cbfa058..96d23b3 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -146,6 +146,13 @@ static int acpi_pm_enter(suspend_state_t pm_state) if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3)) acpi_clear_event(ACPI_EVENT_POWER_BUTTON); + /* +* Disable and clear GPE status before interrupt is enabled. Some GPEs +* (like wakeup GPE) haven't handler, this can avoid such GPE misfire. +* acpi_leave_sleep_state will reenable specific GPEs later +*/ + acpi_hw_disable_all_gpes(); + local_irq_restore(flags); printk(KERN_DEBUG "Back to C!\n"); -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 19/37] Hibernation: Remove unnecessary variable declaration
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Remove the unnecessary extern declaration of resume_file[] from kernel/power/swap.c . Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/swap.c |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index 917aba1..ef41440 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -28,8 +28,6 @@ #include "power.h" -extern char resume_file[]; - #define SWSUSP_SIG "S1SUSPEND" struct swsusp_header { -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 12/37] PM: Suspend/hibernation debug documentation update (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Update the suspend/hibernation debugging and testing documentation to describe the newly introduced testing facilities. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/basic-pm-debugging.txt | 216 Documentation/power/drivers-testing.txt| 30 +++-- 2 files changed, 174 insertions(+), 72 deletions(-) diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt index 57aef2f..1555001 100644 --- a/Documentation/power/basic-pm-debugging.txt +++ b/Documentation/power/basic-pm-debugging.txt @@ -1,45 +1,111 @@ -Debugging suspend and resume +Debugging hibernation and suspend (C) 2007 Rafael J. Wysocki <[EMAIL PROTECTED]>, GPL -1. Testing suspend to disk (STD) +1. Testing hibernation (aka suspend to disk or STD) -To verify that the STD works, you can try to suspend in the "reboot" mode: +To check if hibernation works, you can try to hibernate in the "reboot" mode: # echo reboot > /sys/power/disk # echo disk > /sys/power/state -and the system should suspend, reboot, resume and get back to the command prompt -where you have started the transition. If that happens, the STD is most likely -to work correctly, but you need to repeat the test at least a couple of times in -a row for confidence. This is necessary, because some problems only show up on -a second attempt at suspending and resuming the system. You should also test -the "platform" and "shutdown" modes of suspend: +and the system should create a hibernation image, reboot, resume and get back to +the command prompt where you have started the transition. If that happens, +hibernation is most likely to work correctly. Still, you need to repeat the +test at least a couple of times in a row for confidence. [This is necessary, +because some problems only show up on a second attempt at suspending and +resuming the system.] Moreover, hibernating in the "reboot" and "shutdown" +modes causes the PM core to skip some platform-related callbacks which on ACPI +systems might be necessary to make hibernation work. Thus, if you machine fails +to hibernate or resume in the "reboot" mode, you should try the "platform" mode: # echo platform > /sys/power/disk # echo disk > /sys/power/state -or +which is the default and recommended mode of hibernation. + +Unfortunately, the "platform" mode of hibernation does not work on some systems +with broken BIOSes. In such cases the "shutdown" mode of hibernation might +work: # echo shutdown > /sys/power/disk # echo disk > /sys/power/state -in which cases you will have to press the power button to make the system -resume. If that does not work, you will need to identify what goes wrong. +(it is similar to the "reboot" mode, but it requires you to press the power +button to make the system resume). + +If neither "platform" nor "shutdown" hibernation mode works, you will need to +identify what goes wrong. + +a) Test modes of hibernation + +To find out why hibernation fails on your system, you can use a special testing +facility available if the kernel is compiled with CONFIG_PM_DEBUG set. Then, +there is the file /sys/power/pm_test that can be used to make the hibernation +core run in a test mode. There are 5 test modes available: + +freezer +- test the freezing of processes + +devices +- test the freezing of processes and suspending of devices -a) Test mode of STD +platform +- test the freezing of processes, suspending of devices and platform + global control methods(*) -To verify if there are any drivers that cause problems you can run the STD -in the test mode: +processors +- test the freezing of processes, suspending of devices, platform + global control methods(*) and the disabling of nonboot CPUs -# echo test > /sys/power/disk +core +- test the freezing of processes, suspending of devices, platform global + control methods(*), the disabling of nonboot CPUs and suspending of + platform/system devices + +(*) the platform global control methods are only available on ACPI systems +and are only tested if the hibernation mode is set to "platform" + +To use one of them it is necessary to write the corresponding string to +/sys/power/pm_test (eg. "devices" to test the freezing of processes and +suspending devices) and issue the standard hibernation commands. For example, +to use the "devices" test mode along with the "platform" mode of hibernation, +you should do the following: + +# echo devices > /sys/power/pm_test +# echo platform > /sys/power/disk # echo disk > /sys/power/state -in which case the system should free
[PATCH 14/37] PM: Convert PM notifiers to out-of-line code
From: Alan Stern <[EMAIL PROTECTED]> This patch (as1008b) converts the PM notifier routines from inline calls to out-of-line code. It also prevents pm_chain_head from being created when CONFIG_PM_SLEEP isn't enabled, and EXPORTs the notifier registration and unregistration routines. Signed-off-by: Alan Stern <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- include/linux/suspend.h | 13 ++--- kernel/power/main.c | 28 ++-- kernel/power/power.h| 12 3 files changed, 32 insertions(+), 21 deletions(-) diff --git a/include/linux/suspend.h b/include/linux/suspend.h index 40280df..51283e0 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -213,17 +213,8 @@ void save_processor_state(void); void restore_processor_state(void); /* kernel/power/main.c */ -extern struct blocking_notifier_head pm_chain_head; - -static inline int register_pm_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_register(&pm_chain_head, nb); -} - -static inline int unregister_pm_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_unregister(&pm_chain_head, nb); -} +extern int register_pm_notifier(struct notifier_block *nb); +extern int unregister_pm_notifier(struct notifier_block *nb); #define pm_notifier(fn, pri) { \ static struct notifier_block fn##_nb = \ diff --git a/kernel/power/main.c b/kernel/power/main.c index fc717b8..0a9f269 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -24,13 +24,37 @@ #include "power.h" -BLOCKING_NOTIFIER_HEAD(pm_chain_head); - DEFINE_MUTEX(pm_mutex); unsigned int pm_flags; EXPORT_SYMBOL(pm_flags); +#ifdef CONFIG_PM_SLEEP + +/* Routines for PM-transition notifications */ + +static BLOCKING_NOTIFIER_HEAD(pm_chain_head); + +int register_pm_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_register(&pm_chain_head, nb); +} +EXPORT_SYMBOL_GPL(register_pm_notifier); + +int unregister_pm_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_unregister(&pm_chain_head, nb); +} +EXPORT_SYMBOL_GPL(unregister_pm_notifier); + +int pm_notifier_call_chain(unsigned long val) +{ + return (blocking_notifier_call_chain(&pm_chain_head, val, NULL) + == NOTIFY_BAD) ? -EINVAL : 0; +} + +#endif /* CONFIG_PM_SLEEP */ + #ifdef CONFIG_PM_DEBUG int pm_test_level = TEST_NONE; diff --git a/kernel/power/power.h b/kernel/power/power.h index f9f0d4d..a9732fd 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -172,14 +172,10 @@ static inline int suspend_devices_and_enter(suspend_state_t state) } #endif /* !CONFIG_SUSPEND */ -/* kernel/power/common.c */ -extern struct blocking_notifier_head pm_chain_head; - -static inline int pm_notifier_call_chain(unsigned long val) -{ - return (blocking_notifier_call_chain(&pm_chain_head, val, NULL) - == NOTIFY_BAD) ? -EINVAL : 0; -} +#ifdef CONFIG_PM_SLEEP +/* kernel/power/main.c */ +extern int pm_notifier_call_chain(unsigned long val); +#endif #ifdef CONFIG_HIGHMEM unsigned int count_highmem_pages(void); -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 25/37] suspend: fix ia64 allmodconfig build
From: Rafael J. Wysocki <[EMAIL PROTECTED]> kernel/power/main.c:488: error: âpm_test_attrâ undeclared here (not in a function) Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c |7 ++- 1 files changed, 2 insertions(+), 5 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index 5688168..050a607 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -53,10 +53,6 @@ int pm_notifier_call_chain(unsigned long val) == NOTIFY_BAD) ? -EINVAL : 0; } -#endif /* CONFIG_PM_SLEEP */ - -#ifdef CONFIG_SUSPEND - #ifdef CONFIG_PM_DEBUG int pm_test_level = TEST_NONE; @@ -132,6 +128,7 @@ power_attr(pm_test); static inline int suspend_test(int level) { return 0; } #endif /* !CONFIG_PM_DEBUG */ +#endif /* CONFIG_PM_SLEEP */ #ifdef CONFIG_SUSPEND @@ -495,7 +492,7 @@ static struct attribute * g[] = { #ifdef CONFIG_PM_TRACE &pm_trace_attr.attr, #endif -#ifdef CONFIG_PM_DEBUG +#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_PM_DEBUG) &pm_test_attr.attr, #endif NULL, -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 21/37] Hibernation: Update messages
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Make hibernation messages start with one common prefix "PM: " and use the word "hibernation" in the messages as a synonym of "suspend to disk". Turn some KERN_INFO messages into debug ones. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/disk.c | 28 +++- kernel/power/snapshot.c | 23 --- kernel/power/swap.c | 31 +-- kernel/power/swsusp.c |5 +++-- 4 files changed, 47 insertions(+), 40 deletions(-) diff --git a/kernel/power/disk.c b/kernel/power/disk.c index 3e24a20..64e42ab 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -191,8 +191,8 @@ int create_image(int platform_mode) */ error = device_power_down(PMSG_FREEZE); if (error) { - printk(KERN_ERR "Some devices failed to power down, " - KERN_ERR "aborting suspend\n"); + printk(KERN_ERR "PM: Some devices failed to power down, " + "aborting hibernation\n"); goto Enable_irqs; } @@ -203,7 +203,8 @@ int create_image(int platform_mode) save_processor_state(); error = swsusp_arch_suspend(); if (error) - printk(KERN_ERR "Error %d while creating the image\n", error); + printk(KERN_ERR "PM: Error %d creating hibernation image\n", + error); /* Restore control flow magically appears here */ restore_processor_state(); if (!in_suspend) @@ -289,7 +290,7 @@ static int resume_target_kernel(void) local_irq_disable(); error = device_power_down(PMSG_PRETHAW); if (error) { - printk(KERN_ERR "Some devices failed to power down, " + printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); goto Enable_irqs; } @@ -438,7 +439,7 @@ static void power_down(void) * Valid image is on the disk, if we continue we risk serious data * corruption after resume. */ - printk(KERN_CRIT "Please power me down manually\n"); + printk(KERN_CRIT "PM: Please power down manually\n"); while(1); } @@ -484,7 +485,7 @@ int hibernate(void) if (error) goto Exit; - printk("Syncing filesystems ... "); + printk(KERN_INFO "PM: Syncing filesystems ... "); sys_sync(); printk("done.\n"); @@ -560,10 +561,11 @@ static int software_resume(void) return -ENOENT; } swsusp_resume_device = name_to_dev_t(resume_file); - pr_debug("swsusp: Resume From Partition %s\n", resume_file); + pr_debug("PM: Resume from partition %s\n", resume_file); } else { - pr_debug("swsusp: Resume From Partition %d:%d\n", -MAJOR(swsusp_resume_device), MINOR(swsusp_resume_device)); + pr_debug("PM: Resume from partition %d:%d\n", + MAJOR(swsusp_resume_device), + MINOR(swsusp_resume_device)); } if (noresume) { @@ -575,7 +577,7 @@ static int software_resume(void) return 0; } - pr_debug("PM: Checking swsusp image.\n"); + pr_debug("PM: Checking hibernation image.\n"); error = swsusp_check(); if (error) goto Unlock; @@ -601,7 +603,7 @@ static int software_resume(void) goto Done; } - pr_debug("PM: Reading swsusp image.\n"); + pr_debug("PM: Reading hibernation image.\n"); error = swsusp_read(&flags); if (!error) @@ -728,7 +730,7 @@ static ssize_t disk_store(struct kobject *kobj, struct kobj_attribute *attr, error = -EINVAL; if (!error) - pr_debug("PM: suspend-to-disk mode set to '%s'\n", + pr_debug("PM: Hibernation mode set to '%s'\n", hibernation_modes[mode]); mutex_unlock(&pm_mutex); return error ? error : n; @@ -760,7 +762,7 @@ static ssize_t resume_store(struct kobject *kobj, struct kobj_attribute *attr, mutex_lock(&pm_mutex); swsusp_resume_device = res; mutex_unlock(&pm_mutex); - printk("Attempting manual resume\n"); + printk(KERN_INFO "PM: Starting manual resume from disk\n"); noresume = 0; software_resume(); r
[PATCH 31/37] ACPI hibernation: Call _PTS before suspending devices
From: Rafael J. Wysocki <[EMAIL PROTECTED]> The ACPI 1.0 specification wants us to put devices into low power states after executing the _PTS global control method, while ACPI 2.0 and later want us to do that in the reverse order. The current hibernation code follows ACPI 2.0 in that respect which may cause some ACPI 1.0x systems to hang during hibernation (ref. http://bugzilla.kernel.org/show_bug.cgi?id=9528). Make the hibernation code execute _PTS before putting devices into low power states (ie. in accordance with ACPI 1.0x) with the possibility to override that using the 'acpi_new_pts_ordering' kernel command line option. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/sleep/main.c | 36 +--- 1 files changed, 25 insertions(+), 11 deletions(-) diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index 31e8e58..10db899 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -283,22 +283,34 @@ static struct dmi_system_id __initdata acpisleep_dmi_table[] = { #ifdef CONFIG_HIBERNATION static int acpi_hibernation_begin(void) { + int error; + acpi_target_sleep_state = ACPI_STATE_S4; - return 0; + if (new_pts_ordering) + return 0; + + error = acpi_sleep_prepare(ACPI_STATE_S4); + if (error) + acpi_target_sleep_state = ACPI_STATE_S0; + else + acpi_sleep_finish_wake_up = true; + + return error; } static int acpi_hibernation_prepare(void) { - int error; - - error = acpi_sleep_prepare(ACPI_STATE_S4); - if (error) - return error; + if (new_pts_ordering) { + int error = acpi_sleep_prepare(ACPI_STATE_S4); - if (!ACPI_SUCCESS(acpi_hw_disable_all_gpes())) - error = -EFAULT; + if (error) { + acpi_target_sleep_state = ACPI_STATE_S0; + return error; + } + acpi_sleep_finish_wake_up = true; + } - return error; + return ACPI_SUCCESS(acpi_hw_disable_all_gpes()) ? 0 : -EFAULT; } static int acpi_hibernation_enter(void) @@ -339,15 +351,17 @@ static void acpi_hibernation_finish(void) acpi_set_firmware_waking_vector((acpi_physical_address) 0); acpi_target_sleep_state = ACPI_STATE_S0; + acpi_sleep_finish_wake_up = false; } static void acpi_hibernation_end(void) { /* * This is necessary in case acpi_hibernation_finish() is not called -* during a failing transition to the sleep state. +* directly during a failing transition to the sleep state. */ - acpi_target_sleep_state = ACPI_STATE_S0; + if (acpi_sleep_finish_wake_up) + acpi_hibernation_finish(); } static int acpi_hibernation_pre_restore(void) -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 22/37] Hibernation: Clean up Kconfig (V2)
From: Johannes Berg <[EMAIL PROTECTED]> This cleans up the hibernation Kconfig and removes the need to declare centrally which architectures support hibernation. All architectures that currently support hibernation are modified accordingly. Signed-off-by: Johannes Berg <[EMAIL PROTECTED]> Acked-by: Paul Mackerras <[EMAIL PROTECTED]> Cc: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- arch/powerpc/Kconfig | 14 -- arch/x86/Kconfig |4 kernel/power/Kconfig | 18 +++--- 3 files changed, 19 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 9c44af3..68f0cf7 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -151,9 +151,19 @@ config DEFAULT_UIMAGE config REDBOOT bool -config PPC64_SWSUSP +config HIBERNATE_32 bool - depends on PPC64 && (BROKEN || (PPC_PMAC64 && EXPERIMENTAL)) + depends on (PPC_PMAC && !SMP) || BROKEN + default y + +config HIBERNATE_64 + bool + depends on BROKEN || (PPC_PMAC64 && EXPERIMENTAL) + default y + +config ARCH_HIBERNATION_POSSIBLE + bool + depends on (PPC64 && HIBERNATE_64) || (PPC32 && HIBERNATE_32) default y config PPC_DCR_NATIVE diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 65b4491..9c511dd 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -109,6 +109,10 @@ config ARCH_SUPPORTS_OPROFILE select HAVE_KVM +config ARCH_HIBERNATION_POSSIBLE + def_bool y + depends on !SMP || !X86_VOYAGER + config ZONE_DMA32 bool default X86_64 diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig index 06a08f7..fd76d54 100644 --- a/kernel/power/Kconfig +++ b/kernel/power/Kconfig @@ -84,7 +84,8 @@ config PM_TRACE_RTC config PM_SLEEP_SMP bool - depends on SUSPEND_SMP_POSSIBLE || HIBERNATION_SMP_POSSIBLE + depends on SMP + depends on SUSPEND_SMP_POSSIBLE || ARCH_HIBERNATION_POSSIBLE depends on PM_SLEEP select HOTPLUG_CPU default y @@ -118,22 +119,9 @@ config SUSPEND powered and thus its contents are preserved, such as the suspend-to-RAM state (i.e. the ACPI S3 state). -config HIBERNATION_UP_POSSIBLE - bool - depends on X86 || PPC64_SWSUSP || PPC32 - depends on !SMP - default y - -config HIBERNATION_SMP_POSSIBLE - bool - depends on (X86 && !X86_VOYAGER) || PPC64_SWSUSP - depends on SMP - default y - config HIBERNATION bool "Hibernation (aka 'suspend to disk')" - depends on PM && SWAP - depends on HIBERNATION_UP_POSSIBLE || HIBERNATION_SMP_POSSIBLE + depends on PM && SWAP && ARCH_HIBERNATION_POSSIBLE ---help--- Enable the suspend to disk (STD) functionality, which is usually called "hibernation" in user interfaces. STD checkpoints the -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 28/37] ACPI: Separate disabling of GPEs from _PTS
From: Rafael J. Wysocki <[EMAIL PROTECTED]> The preparation to enter an ACPI system sleep state is now tied to the disabling of GPEs, but the GPEs should not be disabled before suspending devices. Since on ACPI 1.0x systems the _PTS global control method should be executed before suspending devices, we need to disable GPEs separately. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/hardware/hwsleep.c |4 drivers/acpi/sleep/main.c | 17 +++-- 2 files changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/acpi/hardware/hwsleep.c b/drivers/acpi/hardware/hwsleep.c index 13c93a1..fd1c4ba 100644 --- a/drivers/acpi/hardware/hwsleep.c +++ b/drivers/acpi/hardware/hwsleep.c @@ -229,10 +229,6 @@ acpi_status acpi_enter_sleep_state_prep(u8 sleep_state) "While executing method _SST")); } - /* Disable/Clear all GPEs */ - - status = acpi_hw_disable_all_gpes(); - return_ACPI_STATUS(status); } diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index fdd8139..198ff8a 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -91,10 +91,13 @@ static int acpi_pm_begin(suspend_state_t pm_state) static int acpi_pm_prepare(void) { - int error = acpi_sleep_prepare(acpi_target_sleep_state); + int error; + error = acpi_sleep_prepare(acpi_target_sleep_state); if (error) acpi_target_sleep_state = ACPI_STATE_S0; + else if (!ACPI_SUCCESS(acpi_hw_disable_all_gpes())) + error = -EFAULT; return error; } @@ -261,7 +264,16 @@ static int acpi_hibernation_start(void) static int acpi_hibernation_prepare(void) { - return acpi_sleep_prepare(ACPI_STATE_S4); + int error; + + error = acpi_sleep_prepare(ACPI_STATE_S4); + if (error) + return error; + + if (!ACPI_SUCCESS(acpi_hw_disable_all_gpes())) + error = -EFAULT; + + return error; } static int acpi_hibernation_enter(void) @@ -426,6 +438,7 @@ static void acpi_power_off_prepare(void) { /* Prepare to power off the system */ acpi_sleep_prepare(ACPI_STATE_S5); + acpi_hw_disable_all_gpes(); } static void acpi_power_off(void) -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 30/37] Hibernation: Introduce begin() and end() callbacks
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Introduce global hibernation callback .end() and rename global hibernation callback .start() to .begin(), in analogy with the recent modifications of the global suspend callbacks. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/sleep/main.c | 14 -- include/linux/suspend.h | 14 +- kernel/power/disk.c | 33 - 3 files changed, 45 insertions(+), 16 deletions(-) diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index c37c4ea..31e8e58 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -281,7 +281,7 @@ static struct dmi_system_id __initdata acpisleep_dmi_table[] = { #endif /* CONFIG_SUSPEND */ #ifdef CONFIG_HIBERNATION -static int acpi_hibernation_start(void) +static int acpi_hibernation_begin(void) { acpi_target_sleep_state = ACPI_STATE_S4; return 0; @@ -341,6 +341,15 @@ static void acpi_hibernation_finish(void) acpi_target_sleep_state = ACPI_STATE_S0; } +static void acpi_hibernation_end(void) +{ + /* +* This is necessary in case acpi_hibernation_finish() is not called +* during a failing transition to the sleep state. +*/ + acpi_target_sleep_state = ACPI_STATE_S0; +} + static int acpi_hibernation_pre_restore(void) { acpi_status status; @@ -356,7 +365,8 @@ static void acpi_hibernation_restore_cleanup(void) } static struct platform_hibernation_ops acpi_hibernation_ops = { - .start = acpi_hibernation_start, + .begin = acpi_hibernation_begin, + .end = acpi_hibernation_end, .pre_snapshot = acpi_hibernation_prepare, .finish = acpi_hibernation_finish, .prepare = acpi_hibernation_prepare, diff --git a/include/linux/suspend.h b/include/linux/suspend.h index a0b1dbb..646ce2d 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -136,14 +136,17 @@ extern void mark_free_pages(struct zone *zone); /** * struct platform_hibernation_ops - hibernation platform support * - * The methods in this structure allow a platform to override the default - * mechanism of shutting down the machine during a hibernation transition. + * The methods in this structure allow a platform to carry out special + * operations required by it during a hibernation transition. * - * All three methods must be assigned. + * All the methods below must be implemented. * - * @start: Tell the platform driver that we're starting hibernation. + * @begin: Tell the platform driver that we're starting hibernation. * Called right after shrinking memory and before freezing devices. * + * @end: Called by the PM core right after resuming devices, to indicate to + * the platform that the system has returned to the working state. + * * @pre_snapshot: Prepare the platform for creating the hibernation image. * Called right after devices have been frozen and before the nonboot * CPUs are disabled (runs with IRQs on). @@ -178,7 +181,8 @@ extern void mark_free_pages(struct zone *zone); * thawing devices (runs with IRQs on). */ struct platform_hibernation_ops { - int (*start)(void); + int (*begin)(void); + void (*end)(void); int (*pre_snapshot)(void); void (*finish)(void); int (*prepare)(void); diff --git a/kernel/power/disk.c b/kernel/power/disk.c index 64e42ab..53c22d9 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -54,8 +54,8 @@ static struct platform_hibernation_ops *hibernation_ops; void hibernation_set_ops(struct platform_hibernation_ops *ops) { - if (ops && !(ops->start && ops->pre_snapshot && ops->finish - && ops->prepare && ops->enter && ops->pre_restore + if (ops && !(ops->begin && ops->end && ops->pre_snapshot + && ops->prepare && ops->finish && ops->enter && ops->pre_restore && ops->restore_cleanup)) { WARN_ON(1); return; @@ -100,14 +100,25 @@ static int hibernation_test(int level) { return 0; } #endif /* !CONFIG_PM_DEBUG */ /** - * platform_start - tell the platform driver that we're starting + * platform_begin - tell the platform driver that we're starting * hibernation */ -static int platform_start(int platform_mode) +static int platform_begin(int platform_mode) { return (platform_mode && hibernation_ops) ? - hibernation_ops->start() : 0; + hibernation_ops->begin() : 0; +} + +/** + * platform_end - tell the platform driver that we've entered the + * working state + */ + +static void platform_end(int platform_mode) +{ + if (platform_mode &
[PATCH 32/37] ACPI: Print message before calling _PTS
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Make acpi_sleep_prepare() static and cause it to print a message specifying the ACPI system sleep state to be entered (helpful for debugging the suspend/hibernation code). Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/sleep/main.c |4 +++- drivers/acpi/sleep/sleep.h |2 -- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index 10db899..485de13 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -43,7 +43,7 @@ static int __init acpi_new_pts_ordering(char *str) __setup("acpi_new_pts_ordering", acpi_new_pts_ordering); #endif -int acpi_sleep_prepare(u32 acpi_state) +static int acpi_sleep_prepare(u32 acpi_state) { #ifdef CONFIG_ACPI_SLEEP /* do we have a wakeup address for S2 and S3? */ @@ -59,6 +59,8 @@ int acpi_sleep_prepare(u32 acpi_state) ACPI_FLUSH_CPU_CACHE(); acpi_enable_wakeup_device_prep(acpi_state); #endif + printk(KERN_INFO PREFIX "Preparing to enter system sleep state S%d\n", + acpi_state); acpi_enter_sleep_state_prep(acpi_state); return 0; } diff --git a/drivers/acpi/sleep/sleep.h b/drivers/acpi/sleep/sleep.h index a2ea125..cfaf8f5 100644 --- a/drivers/acpi/sleep/sleep.h +++ b/drivers/acpi/sleep/sleep.h @@ -5,5 +5,3 @@ extern int acpi_suspend (u32 state); extern void acpi_enable_wakeup_device_prep(u8 sleep_state); extern void acpi_enable_wakeup_device(u8 sleep_state); extern void acpi_disable_wakeup_device(u8 sleep_state); - -extern int acpi_sleep_prepare(u32 acpi_state); -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 35/37] Suspend: Invoke suspend notifications after console switch
From: Johannes Berg <[EMAIL PROTECTED]> In order to fix APM emulation it is necessary to enable apm-emulation notifications for suspends triggered in various ways via the suspend notifiers. However, this will cause the systems using APM emulation to lock up between X being needed to switch away from the VT and X already waiting for resume in the APM ioctl. This patch moves the console switch (if enabled) before the suspend notification (and after the resume notification) to avoid this issue. Signed-off-by: Johannes Berg <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index e47214c..6a6d5eb 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -175,12 +175,12 @@ static int suspend_prepare(void) if (!suspend_ops || !suspend_ops->enter) return -EPERM; + pm_prepare_console(); + error = pm_notifier_call_chain(PM_SUSPEND_PREPARE); if (error) goto Finish; - pm_prepare_console(); - if (suspend_freeze_processes()) { error = -EAGAIN; goto Thaw; @@ -200,9 +200,9 @@ static int suspend_prepare(void) Thaw: suspend_thaw_processes(); - pm_restore_console(); Finish: pm_notifier_call_chain(PM_POST_SUSPEND); + pm_restore_console(); return error; } @@ -309,8 +309,8 @@ int suspend_devices_and_enter(suspend_state_t state) static void suspend_finish(void) { suspend_thaw_processes(); - pm_restore_console(); pm_notifier_call_chain(PM_POST_SUSPEND); + pm_restore_console(); } -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 37/37] PM: Remove obsolete /sys/devices/.../power/state docs
From: David Brownell <[EMAIL PROTECTED]> The /sys/devices/.../power/state files have been gone for a while now, but I just noticed some documentation that still refers to them. (Fortunately described as DEPRECATED and WILL REMOVE). Time to remove that obsolete documentation too ... Signed-off-by: David Brownell <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/devices.txt | 49 --- 1 files changed, 0 insertions(+), 49 deletions(-) diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt index d0e79d5..c53d263 100644 --- a/Documentation/power/devices.txt +++ b/Documentation/power/devices.txt @@ -502,52 +502,3 @@ If the CPU can have a "cpufreq" driver, there also may be opportunities to shift to lower voltage settings and reduce the power cost of executing a given number of instructions. (Without voltage adjustment, it's rare for cpufreq to save much power; the cost-per-instruction must go down.) - - -/sys/devices/.../power/state files -== -For now you can also test some of this functionality using sysfs. - - DEPRECATED: USE "power/state" ONLY FOR DRIVER TESTING, AND - AVOID USING dev->power.power_state IN DRIVERS. - - THESE WILL BE REMOVED. IF THE "power/state" FILE GETS REPLACED, - IT WILL BECOME SOMETHING COUPLED TO THE BUS OR DRIVER. - -In each device's directory, there is a 'power' directory, which contains -at least a 'state' file. The value of this field is effectively boolean, -PM_EVENT_ON or PM_EVENT_SUSPEND. - - * Reading from this file displays a value corresponding to - the power.power_state.event field. All nonzero values are - displayed as "2", corresponding to a low power state; zero - is displayed as "0", corresponding to normal operation. - - * Writing to this file initiates a transition using the - specified event code number; only '0', '2', and '3' are - accepted (without a newline); '2' and '3' are both - mapped to PM_EVENT_SUSPEND. - -On writes, the PM core relies on that recorded event code and the device/bus -capabilities to determine whether it uses a partial suspend() or resume() -sequence to change things so that the recorded event corresponds to the -numeric parameter. - - - If the bus requires the irqs-disabled suspend_late()/resume_early() - phases, writes fail because those operations are not supported here. - - - If the recorded value is the expected value, nothing is done. - - - If the recorded value is nonzero, the device is partially resumed, - using the bus.resume() and/or class.resume() methods. - - - If the target value is nonzero, the device is partially suspended, - using the class.suspend() and/or bus.suspend() methods and the - PM_EVENT_SUSPEND message. - -Drivers have no way to tell whether their suspend() and resume() calls -have come through the sysfs power/state file or as part of entering a -system sleep state, except that when accessed through sysfs the normal -parent/child sequencing rules are ignored. Drivers (such as bus, bridge, -or hub drivers) which expose child devices may need to enforce those rules -on their own. -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 36/37] Hibernation: Invoke suspend notifications after console switch
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Following the recent change in the suspend code path, switch consoles before calling PM notifiers during hibernation. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/disk.c | 17 +++-- 1 files changed, 7 insertions(+), 10 deletions(-) diff --git a/kernel/power/disk.c b/kernel/power/disk.c index 53c22d9..d09da08 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -458,20 +458,13 @@ static void power_down(void) while(1); } -static void unprepare_processes(void) -{ - thaw_processes(); - pm_restore_console(); -} - static int prepare_processes(void) { int error = 0; - pm_prepare_console(); if (freeze_processes()) { error = -EBUSY; - unprepare_processes(); + thaw_processes(); } return error; } @@ -491,6 +484,7 @@ int hibernate(void) goto Unlock; } + pm_prepare_console(); error = pm_notifier_call_chain(PM_HIBERNATION_PREPARE); if (error) goto Exit; @@ -530,11 +524,12 @@ int hibernate(void) swsusp_free(); } Thaw: - unprepare_processes(); + thaw_processes(); Finish: free_basic_memory_bitmaps(); Exit: pm_notifier_call_chain(PM_POST_HIBERNATION); + pm_restore_console(); atomic_inc(&snapshot_device_available); Unlock: mutex_unlock(&pm_mutex); @@ -603,6 +598,7 @@ static int software_resume(void) goto Unlock; } + pm_prepare_console(); error = pm_notifier_call_chain(PM_RESTORE_PREPARE); if (error) goto Finish; @@ -626,11 +622,12 @@ static int software_resume(void) printk(KERN_ERR "PM: Restore failed, recovering.\n"); swsusp_free(); - unprepare_processes(); + thaw_processes(); Done: free_basic_memory_bitmaps(); Finish: pm_notifier_call_chain(PM_POST_RESTORE); + pm_restore_console(); atomic_inc(&snapshot_device_available); /* For success case, the suspend path will release the lock */ Unlock: -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 17/37] Suspend: Fix comment in main.c
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Fix a comment in kernel/power/main.c so that it doesn't contain lines longer that 80 characters. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index 7fb805d..d0cedaa 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -250,8 +250,8 @@ static int suspend_enter(suspend_state_t state) } /** - * suspend_devices_and_enter - suspend devices and enter the desired system sleep - * state. + * suspend_devices_and_enter - suspend devices and enter the desired system + * sleep state. * @state: state to enter */ int suspend_devices_and_enter(suspend_state_t state) -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 20/37] Suspend: Use common prefix in messages
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Make suspend messages start with one common prefix "PM: ". Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index d0cedaa..5688168 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -235,7 +235,7 @@ static int suspend_enter(suspend_state_t state) BUG_ON(!irqs_disabled()); if ((error = device_power_down(PMSG_SUSPEND))) { - printk(KERN_ERR "Some devices failed to power down\n"); + printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } @@ -269,7 +269,7 @@ int suspend_devices_and_enter(suspend_state_t state) suspend_console(); error = device_suspend(PMSG_SUSPEND); if (error) { - printk(KERN_ERR "Some devices failed to suspend\n"); + printk(KERN_ERR "PM: Some devices failed to suspend\n"); goto Resume_console; } @@ -352,7 +352,7 @@ static int enter_state(suspend_state_t state) if (!mutex_trylock(&pm_mutex)) return -EBUSY; - printk("Syncing filesystems ... "); + printk(KERN_INFO "PM: Syncing filesystems ... "); sys_sync(); printk("done.\n"); -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 29/37] ACPI suspend: Call _PTS before suspending devices
From: Rafael J. Wysocki <[EMAIL PROTECTED]> The ACPI 1.0 specification wants us to put devices into low power states after executing the _PTS global control method, while ACPI 2.0 and later want us to do that in the reverse order. The current suspend code follows ACPI 2.0 in that respect which causes some ACPI 1.0x systems to hang during suspend (ref. http://bugzilla.kernel.org/show_bug.cgi?id=9528). Make the suspend code execute _PTS before putting devices into low power states (ie. in accordance with ACPI 1.0x) and provide a command line option to override the default if need be. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/kernel-parameters.txt |5 +++ drivers/acpi/sleep/main.c | 51 ++- 2 files changed, 43 insertions(+), 13 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 92c40d1..cf38689 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -168,6 +168,11 @@ and is between 256 and 4096 characters. It is defined in the file acpi_irq_isa= [HW,ACPI] If irq_balance, mark listed IRQs used by ISA Format: ,... + acpi_new_pts_ordering [HW,ACPI] + Enforce the ACPI 2.0 ordering of the _PTS control + method wrt putting devices into low power states + default: pre ACPI 2.0 ordering of _PTS + acpi_no_auto_ssdt [HW,ACPI] Disable automatic loading of SSDT acpi_os_name= [HW,ACPI] Tell ACPI BIOS the name of the OS diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index 198ff8a..c37c4ea 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -26,6 +26,21 @@ u8 sleep_states[ACPI_S_STATE_COUNT]; #ifdef CONFIG_PM_SLEEP static u32 acpi_target_sleep_state = ACPI_STATE_S0; +static bool acpi_sleep_finish_wake_up; + +/* + * ACPI 2.0 and later want us to execute _PTS after suspending devices, so we + * allow the user to request that behavior by using the 'acpi_new_pts_ordering' + * kernel command line option that causes the following variable to be set. + */ +static bool new_pts_ordering; + +static int __init acpi_new_pts_ordering(char *str) +{ + new_pts_ordering = true; + return 1; +} +__setup("acpi_new_pts_ordering", acpi_new_pts_ordering); #endif int acpi_sleep_prepare(u32 acpi_state) @@ -74,6 +89,14 @@ static int acpi_pm_begin(suspend_state_t pm_state) if (sleep_states[acpi_state]) { acpi_target_sleep_state = acpi_state; + if (new_pts_ordering) + return 0; + + error = acpi_sleep_prepare(acpi_state); + if (error) + acpi_target_sleep_state = ACPI_STATE_S0; + else + acpi_sleep_finish_wake_up = true; } else { printk(KERN_ERR "ACPI does not support this state: %d\n", pm_state); @@ -91,15 +114,17 @@ static int acpi_pm_begin(suspend_state_t pm_state) static int acpi_pm_prepare(void) { - int error; + if (new_pts_ordering) { + int error = acpi_sleep_prepare(acpi_target_sleep_state); - error = acpi_sleep_prepare(acpi_target_sleep_state); - if (error) - acpi_target_sleep_state = ACPI_STATE_S0; - else if (!ACPI_SUCCESS(acpi_hw_disable_all_gpes())) - error = -EFAULT; + if (error) { + acpi_target_sleep_state = ACPI_STATE_S0; + return error; + } + acpi_sleep_finish_wake_up = true; + } - return error; + return ACPI_SUCCESS(acpi_hw_disable_all_gpes()) ? 0 : -EFAULT; } /** @@ -123,10 +148,8 @@ static int acpi_pm_enter(suspend_state_t pm_state) if (acpi_state == ACPI_STATE_S3) { int error = acpi_save_state_mem(); - if (error) { - acpi_target_sleep_state = ACPI_STATE_S0; + if (error) return error; - } } local_irq_save(flags); @@ -187,6 +210,7 @@ static void acpi_pm_finish(void) acpi_set_firmware_waking_vector((acpi_physical_address) 0); acpi_target_sleep_state = ACPI_STATE_S0; + acpi_sleep_finish_wake_up = false; #ifdef CONFIG_X86 if (init_8259A_after_S1) { @@ -203,10 +227,11 @@ static void acpi_pm_finish(void) static void acpi_pm_end(void) { /* -* This is necessary in case acpi_pm_finish() is not called during a -* failing transition to a sleep state. +* This is necessary in case acpi_pm_finish() is not called directly +* during a failing transition to a sleep state. */ - acp
[PATCH 33/37] Suspend: Add config option to disable the freezer if architecture wants that
From: Johannes Berg <[EMAIL PROTECTED]> This patch makes the freezer optional for suspend to allow the system to work (or not work) like the original PMU suspend. Signed-off-by: Johannes Berg <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- arch/powerpc/Kconfig |4 kernel/power/Kconfig | 11 +++ kernel/power/main.c |6 +++--- kernel/power/power.h | 22 ++ 4 files changed, 40 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 824140d..4a22c99 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -405,6 +405,10 @@ config CMDLINE most cases you will need to specify the root device here. if !44x || BROKEN +config ARCH_WANTS_FREEZER_CONTROL + def_bool y + depends on ADB_PMU + source kernel/power/Kconfig endif diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig index f8153fd..ef9b802 100644 --- a/kernel/power/Kconfig +++ b/kernel/power/Kconfig @@ -104,6 +104,17 @@ config SUSPEND powered and thus its contents are preserved, such as the suspend-to-RAM state (e.g. the ACPI S3 state). +config SUSPEND_FREEZER + bool "Enable freezer for suspend to RAM/standby" \ + if ARCH_WANTS_FREEZER_CONTROL || BROKEN + depends on SUSPEND + default y + help + This allows you to turn off the freezer for suspend. If this is + done, no tasks are frozen for suspend to RAM/standby. + + Turning OFF this setting is NOT recommended! If in doubt, say Y. + config HIBERNATION bool "Hibernation (aka 'suspend to disk')" depends on PM && SWAP && ARCH_HIBERNATION_POSSIBLE diff --git a/kernel/power/main.c b/kernel/power/main.c index d9bba45..e47214c 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -181,7 +181,7 @@ static int suspend_prepare(void) pm_prepare_console(); - if (freeze_processes()) { + if (suspend_freeze_processes()) { error = -EAGAIN; goto Thaw; } @@ -199,7 +199,7 @@ static int suspend_prepare(void) return 0; Thaw: - thaw_processes(); + suspend_thaw_processes(); pm_restore_console(); Finish: pm_notifier_call_chain(PM_POST_SUSPEND); @@ -308,7 +308,7 @@ int suspend_devices_and_enter(suspend_state_t state) */ static void suspend_finish(void) { - thaw_processes(); + suspend_thaw_processes(); pm_restore_console(); pm_notifier_call_chain(PM_POST_SUSPEND); } diff --git a/kernel/power/power.h b/kernel/power/power.h index 8ec5499..700f44e 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -1,6 +1,7 @@ #include #include #include +#include struct swsusp_info { struct new_utsname uts; @@ -203,3 +204,24 @@ enum { #define TEST_MAX (__TEST_AFTER_LAST - 1) extern int pm_test_level; + +#ifdef CONFIG_SUSPEND_FREEZER +static inline int suspend_freeze_processes(void) +{ + return freeze_processes(); +} + +static inline void suspend_thaw_processes(void) +{ + thaw_processes(); +} +#else +static inline int suspend_freeze_processes(void) +{ + return 0; +} + +static inline void suspend_thaw_processes(void) +{ +} +#endif -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 34/37] Suspend: Clean up suspend_64.c
From: Borislav Petkov <[EMAIL PROTECTED]> There's a freakishly long comment in suspend_64.c, shorten it. Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- arch/x86/kernel/suspend_64.c |8 ++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/suspend_64.c b/arch/x86/kernel/suspend_64.c index 0919951..7ac7130 100644 --- a/arch/x86/kernel/suspend_64.c +++ b/arch/x86/kernel/suspend_64.c @@ -140,7 +140,12 @@ static void fix_processor_context(void) int cpu = smp_processor_id(); struct tss_struct *t = &per_cpu(init_tss, cpu); - set_tss_desc(cpu,t);/* This just modifies memory; should not be necessary. But... This is necessary, because 386 hardware has concept of busy TSS or some similar stupidity. */ + /* +* This just modifies memory; should not be necessary. But... This +* is necessary, because 386 hardware has concept of busy TSS or some +* similar stupidity. +*/ + set_tss_desc(cpu, t); get_cpu_gdt_table(cpu)[GDT_ENTRY_TSS].type = 9; @@ -160,7 +165,6 @@ static void fix_processor_context(void) loaddebug(¤t->thread, 6); loaddebug(¤t->thread, 7); } - } #ifdef CONFIG_HIBERNATION -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 27/37] ACPI: Separate invocations of _GTS and _BFS from _PTS and _WAK
From: Rafael J. Wysocki <[EMAIL PROTECTED]> The execution of ACPI global control methods _GTS and _BFS is currently tied to the preparation to enter a sleep state and to the leaving of the sleep state, respectively. However, these functions are called before disabling the nonboot CPUs and after enabling them, respectively (in fact, on ACPI 1.0x systems the first of them ought to be called before suspending devices), while according to the ACPI specification, _GTS is to be executed right prior to entering the system sleep state and _BFS is to be executed right after the platfor firmware has returned control to the OS on wake up. Move the execution of _GTS and _BFS to the right places. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- drivers/acpi/hardware/hwsleep.c | 75 +- drivers/acpi/sleep/main.c |7 include/acpi/acpixf.h |2 + 3 files changed, 66 insertions(+), 18 deletions(-) diff --git a/drivers/acpi/hardware/hwsleep.c b/drivers/acpi/hardware/hwsleep.c index 81b2484..13c93a1 100644 --- a/drivers/acpi/hardware/hwsleep.c +++ b/drivers/acpi/hardware/hwsleep.c @@ -192,18 +192,13 @@ acpi_status acpi_enter_sleep_state_prep(u8 sleep_state) arg.type = ACPI_TYPE_INTEGER; arg.integer.value = sleep_state; - /* Run the _PTS and _GTS methods */ + /* Run the _PTS method */ status = acpi_evaluate_object(NULL, METHOD_NAME__PTS, &arg_list, NULL); if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) { return_ACPI_STATUS(status); } - status = acpi_evaluate_object(NULL, METHOD_NAME__GTS, &arg_list, NULL); - if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) { - return_ACPI_STATUS(status); - } - /* Setup the argument to _SST */ switch (sleep_state) { @@ -262,6 +257,8 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state) struct acpi_bit_register_info *sleep_type_reg_info; struct acpi_bit_register_info *sleep_enable_reg_info; u32 in_value; + struct acpi_object_list arg_list; + union acpi_object arg; acpi_status status; ACPI_FUNCTION_TRACE(acpi_enter_sleep_state); @@ -307,6 +304,18 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state) return_ACPI_STATUS(status); } + /* Execute the _GTS method */ + + arg_list.count = 1; + arg_list.pointer = &arg; + arg.type = ACPI_TYPE_INTEGER; + arg.integer.value = sleep_state; + + status = acpi_evaluate_object(NULL, METHOD_NAME__GTS, &arg_list, NULL); + if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) { + return_ACPI_STATUS(status); + } + /* Get current value of PM1A control */ status = acpi_hw_register_read(ACPI_REGISTER_PM1_CONTROL, &PM1Acontrol); @@ -473,17 +482,18 @@ ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state_s4bios) /*** * - * FUNCTION:acpi_leave_sleep_state + * FUNCTION:acpi_leave_sleep_state_prep * - * PARAMETERS: sleep_state - Which sleep state we just exited + * PARAMETERS: sleep_state - Which sleep state we are exiting * * RETURN: Status * - * DESCRIPTION: Perform OS-independent ACPI cleanup after a sleep - * Called with interrupts ENABLED. + * DESCRIPTION: Perform the first state of OS-independent ACPI cleanup after a + * sleep. + * Called with interrupts DISABLED. * **/ -acpi_status acpi_leave_sleep_state(u8 sleep_state) +acpi_status acpi_leave_sleep_state_prep(u8 sleep_state) { struct acpi_object_list arg_list; union acpi_object arg; @@ -493,7 +503,7 @@ acpi_status acpi_leave_sleep_state(u8 sleep_state) u32 PM1Acontrol; u32 PM1Bcontrol; - ACPI_FUNCTION_TRACE(acpi_leave_sleep_state); + ACPI_FUNCTION_TRACE(acpi_leave_sleep_state_prep); /* * Set SLP_TYPE and SLP_EN to state S0. @@ -540,6 +550,41 @@ acpi_status acpi_leave_sleep_state(u8 sleep_state) } } + /* Execute the _BFS method */ + + arg_list.count = 1; + arg_list.pointer = &arg; + arg.type = ACPI_TYPE_INTEGER; + arg.integer.value = sleep_state; + + status = acpi_evaluate_object(NULL, METHOD_NAME__BFS, &arg_list, NULL); + if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) { + ACPI_EXCEPTION((AE_INFO, status, "During Method _BFS")); + } + + return_ACPI_STATUS(status); +} + +/*** + * + * FUNCTION:
[PATCH 15/37] Suspend: Fix compilation warning for CONFIG_SUSPEND unset
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Suspend: Make debug facility depend on CONFIG_SUSPEND Make the new suspend debug facility code depend on CONFIG_SUSPEND, as appropriate, to remove the compiler warning printed when CONFIG_PM is set and CONFIG_SUSPEND is not set. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/main.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/kernel/power/main.c b/kernel/power/main.c index 0a9f269..7fb805d 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -55,6 +55,8 @@ int pm_notifier_call_chain(unsigned long val) #endif /* CONFIG_PM_SLEEP */ +#ifdef CONFIG_SUSPEND + #ifdef CONFIG_PM_DEBUG int pm_test_level = TEST_NONE; -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 16/37] Hibernation: Move low level resume to disk.c
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Move the low level restore code to kernel/power/disk.c , since the corresponding low level hibernation code is already there. Make restore fail if device_power_down(PMSG_PRETHAW) returns an error. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/disk.c | 49 - kernel/power/power.h |1 - kernel/power/swsusp.c | 35 --- 3 files changed, 48 insertions(+), 37 deletions(-) diff --git a/kernel/power/disk.c b/kernel/power/disk.c index 0866b16..2a4bada 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -275,6 +275,53 @@ int hibernation_snapshot(int platform_mode) } /** + * resume_target_kernel - prepare devices that need to be suspended with + * interrupts off, restore the contents of highmem that have not been + * restored yet from the image and run the low level code that will restore + * the remaining contents of memory and switch to the just restored target + * kernel. + */ + +static int resume_target_kernel(void) +{ + int error; + + local_irq_disable(); + error = device_power_down(PMSG_PRETHAW); + if (error) { + printk(KERN_ERR "Some devices failed to power down, " + "aborting resume\n"); + goto Enable_irqs; + } + /* We'll ignore saved state, but this gets preempt count (etc) right */ + save_processor_state(); + error = restore_highmem(); + if (!error) { + error = swsusp_arch_resume(); + /* +* The code below is only ever reached in case of a failure. +* Otherwise execution continues at place where +* swsusp_arch_suspend() was called +*/ + BUG_ON(!error); + /* This call to restore_highmem() undos the previous one */ + restore_highmem(); + } + /* +* The only reason why swsusp_arch_resume() can fail is memory being +* very tight, so we have to free it as soon as we can to avoid +* subsequent failures +*/ + swsusp_free(); + restore_processor_state(); + touch_softlockup_watchdog(); + device_power_up(); + Enable_irqs: + local_irq_enable(); + return error; +} + +/** * hibernation_restore - quiesce devices and restore the hibernation * snapshot image. If successful, control returns in hibernation_snaphot() * @platform_mode - if set, use the platform driver, if available, to @@ -297,7 +344,7 @@ int hibernation_restore(int platform_mode) if (!error) { error = disable_nonboot_cpus(); if (!error) - error = swsusp_resume(); + error = resume_target_kernel(); enable_nonboot_cpus(); } platform_restore_cleanup(platform_mode); diff --git a/kernel/power/power.h b/kernel/power/power.h index a9732fd..8ec5499 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -152,7 +152,6 @@ extern int swsusp_swap_in_use(void); extern int swsusp_check(void); extern int swsusp_shrink_memory(void); extern void swsusp_free(void); -extern int swsusp_resume(void); extern int swsusp_read(unsigned int *flags_p); extern int swsusp_write(unsigned int flags); extern void swsusp_close(void); diff --git a/kernel/power/swsusp.c b/kernel/power/swsusp.c index 605c536..dc29a20 100644 --- a/kernel/power/swsusp.c +++ b/kernel/power/swsusp.c @@ -261,38 +261,3 @@ int swsusp_shrink_memory(void) return 0; } - -int swsusp_resume(void) -{ - int error; - - local_irq_disable(); - /* NOTE: device_power_down() is just a suspend() with irqs off; -* it has no special "power things down" semantics -*/ - if (device_power_down(PMSG_PRETHAW)) - printk(KERN_ERR "Some devices failed to power down, very bad\n"); - /* We'll ignore saved state, but this gets preempt count (etc) right */ - save_processor_state(); - error = restore_highmem(); - if (!error) { - error = swsusp_arch_resume(); - /* The code below is only ever reached in case of a failure. -* Otherwise execution continues at place where -* swsusp_arch_suspend() was called -*/ - BUG_ON(!error); - /* This call to restore_highmem() undos the previous one */ - restore_highmem(); - } - /* The only reason why swsusp_arch_resume() can fail is memory being -* very tight, so we have to free it as soon as we can to avoid -* subsequent failures -*/ - swsusp_fr
[PATCH 26/37] Suspend: Introduce begin() and end() callbacks
From: Rafael J. Wysocki <[EMAIL PROTECTED]> On ACPI systems the target state set by acpi_pm_set_target() is reset by acpi_pm_finish(), but that need not be called if the suspend fails. Â All platforms that use the .set_target() global suspend callback are affected by analogous issues. For this reason, we need an additional global suspend callback that will reset the target state regardless of whether or not the suspend is successful. Â Also, it is reasonable to rename the .set_target() callback, since it will be used for a different purpose on ACPI systems (due to ACPI 1.0x code ordering requirements). Introduce the global suspend callback .end() to be executed at the end of the suspend sequence and rename the .set_target() global suspend callback to .begin(). Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- arch/arm/mach-at91/pm.c | 17 + arch/powerpc/platforms/52xx/lite5200_pm.c | 10 -- drivers/acpi/sleep/main.c | 22 ++ include/linux/suspend.h | 29 ++--- kernel/power/main.c |9 ++--- 5 files changed, 63 insertions(+), 24 deletions(-) diff --git a/arch/arm/mach-at91/pm.c b/arch/arm/mach-at91/pm.c index 4b120cc..a67defd 100644 --- a/arch/arm/mach-at91/pm.c +++ b/arch/arm/mach-at91/pm.c @@ -52,7 +52,7 @@ static suspend_state_t target_state; /* * Called after processes are frozen, but before we shutdown devices. */ -static int at91_pm_set_target(suspend_state_t state) +static int at91_pm_begin(suspend_state_t state) { target_state = state; return 0; @@ -202,11 +202,20 @@ error: return 0; } +/* + * Called right prior to thawing processes. + */ +static void at91_pm_end(void) +{ + target_state = PM_SUSPEND_ON; +} + static struct platform_suspend_ops at91_pm_ops ={ - .valid = at91_pm_valid_state, - .set_target = at91_pm_set_target, - .enter = at91_pm_enter, + .valid = at91_pm_valid_state, + .begin = at91_pm_begin, + .enter = at91_pm_enter, + .end= at91_pm_end, }; static int __init at91_pm_init(void) diff --git a/arch/powerpc/platforms/52xx/lite5200_pm.c b/arch/powerpc/platforms/52xx/lite5200_pm.c index c0f13e8..41c7fd9 100644 --- a/arch/powerpc/platforms/52xx/lite5200_pm.c +++ b/arch/powerpc/platforms/52xx/lite5200_pm.c @@ -31,7 +31,7 @@ static int lite5200_pm_valid(suspend_state_t state) } } -static int lite5200_pm_set_target(suspend_state_t state) +static int lite5200_pm_begin(suspend_state_t state) { if (lite5200_pm_valid(state)) { lite5200_pm_target_state = state; @@ -219,12 +219,18 @@ static void lite5200_pm_finish(void) mpc52xx_pm_finish(); } +static void lite5200_pm_end(void) +{ + lite5200_pm_target_state = PM_SUSPEND_ON; +} + static struct platform_suspend_ops lite5200_pm_ops = { .valid = lite5200_pm_valid, - .set_target = lite5200_pm_set_target, + .begin = lite5200_pm_begin, .prepare= lite5200_pm_prepare, .enter = lite5200_pm_enter, .finish = lite5200_pm_finish, + .end= lite5200_pm_end, }; int __init lite5200_pm_init(void) diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c index 96d23b3..e2e4e61 100644 --- a/drivers/acpi/sleep/main.c +++ b/drivers/acpi/sleep/main.c @@ -63,11 +63,11 @@ static u32 acpi_suspend_states[] = { static int init_8259A_after_S1; /** - * acpi_pm_set_target - Set the target system sleep state to the state + * acpi_pm_begin - Set the target system sleep state to the state * associated with given @pm_state, if supported. */ -static int acpi_pm_set_target(suspend_state_t pm_state) +static int acpi_pm_begin(suspend_state_t pm_state) { u32 acpi_state = acpi_suspend_states[pm_state]; int error = 0; @@ -164,7 +164,7 @@ static int acpi_pm_enter(suspend_state_t pm_state) } /** - * acpi_pm_finish - Finish up suspend sequence. + * acpi_pm_finish - Instruct the platform to leave a sleep state. * * This is called after we wake back up (or if entering the sleep state * failed). @@ -190,6 +190,19 @@ static void acpi_pm_finish(void) #endif } +/** + * acpi_pm_end - Finish up suspend sequence. + */ + +static void acpi_pm_end(void) +{ + /* +* This is necessary in case acpi_pm_finish() is not called during a +* failing transition to a sleep state. +*/ + acpi_target_sleep_state = ACPI_STATE_S0; +} + static int acpi_pm_state_valid(suspend_state_t pm_state) { u32 acpi_state; @@ -208,10 +221,11 @@ static int acpi_pm_state_valid(suspend_state_t pm_state) static struct platform_suspend_ops acpi_pm_ops = { .valid = a
[PATCH 18/37] Hibernation: Fix comment in disk.c
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Fix a comment in kernel/power/disk.c so that it doesn't contain lines longer that 80 characters. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- kernel/power/disk.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/power/disk.c b/kernel/power/disk.c index 2a4bada..3e24a20 100644 --- a/kernel/power/disk.c +++ b/kernel/power/disk.c @@ -568,8 +568,8 @@ static int software_resume(void) if (noresume) { /** -* FIXME: If noresume is specified, we need to find the partition -* and reset it back to normal swap space. +* FIXME: If noresume is specified, we need to find the +* partition and reset it back to normal swap space. */ mutex_unlock(&pm_mutex); return 0; -- 1.5.4.rc5.16.gc0279 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/37] Hibernation: Introduce exportable suspend ioctls header (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Move the definitions of hibernation ioctls to a separate header file in include/linux, which can be exported to the user space. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/userland-swsusp.txt | 10 include/linux/Kbuild|1 + include/linux/suspend_ioctls.h | 32 +++ kernel/power/power.h| 29 +--- 4 files changed, 39 insertions(+), 33 deletions(-) create mode 100644 include/linux/suspend_ioctls.h diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index af52d53..7b99636 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt @@ -14,7 +14,7 @@ are going to develop your own suspend/resume utilities. The interface consists of a character device providing the open(), release(), read(), and write() operations as well as several ioctl() -commands defined in kernel/power/power.h. The major and minor +commands defined in include/linux/suspend_ioctls.h . The major and minor numbers of the device are, respectively, 10 and 231, and they can be read from /sys/class/misc/snapshot/dev. @@ -70,10 +70,10 @@ SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated by SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in units) from the beginning of the partition at which the swap header is located (the last ioctl() argument should point to a struct - resume_swap_area, as defined in kernel/power/power.h, containing the - resume device specification and the offset); for swap partitions the - offset is always 0, but it is different from zero for swap files (see - Documentation/swsusp-and-swap-files.txt for details). + resume_swap_area, as defined in kernel/power/suspend_ioctls.h, + containing the resume device specification and the offset); for swap + partitions the offset is always 0, but it is different from zero for + swap files (see Documentation/swsusp-and-swap-files.txt for details). SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support, depending on the argument value (enable, if the argument is nonzero) diff --git a/include/linux/Kbuild b/include/linux/Kbuild index 85b2482..c0f9bb7 100644 --- a/include/linux/Kbuild +++ b/include/linux/Kbuild @@ -143,6 +143,7 @@ header-y += snmp.h header-y += sockios.h header-y += som.h header-y += sound.h +header-y += suspend_ioctls.h header-y += taskstats.h header-y += telephony.h header-y += termios.h diff --git a/include/linux/suspend_ioctls.h b/include/linux/suspend_ioctls.h new file mode 100644 index 000..2c6faec --- /dev/null +++ b/include/linux/suspend_ioctls.h @@ -0,0 +1,32 @@ +#ifndef _LINUX_SUSPEND_IOCTLS_H +#define _LINUX_SUSPEND_IOCTLS_H + +/* + * This structure is used to pass the values needed for the identification + * of the resume swap area from a user space to the kernel via the + * SNAPSHOT_SET_SWAP_AREA ioctl + */ +struct resume_swap_area { + loff_t offset; + u_int32_t dev; +} __attribute__((packed)); + +#define SNAPSHOT_IOC_MAGIC '3' +#define SNAPSHOT_FREEZE_IO(SNAPSHOT_IOC_MAGIC, 1) +#define SNAPSHOT_UNFREEZE _IO(SNAPSHOT_IOC_MAGIC, 2) +#define SNAPSHOT_ATOMIC_RESTORE_IO(SNAPSHOT_IOC_MAGIC, 4) +#define SNAPSHOT_FREE _IO(SNAPSHOT_IOC_MAGIC, 5) +#define SNAPSHOT_FREE_SWAP_PAGES _IO(SNAPSHOT_IOC_MAGIC, 9) +#define SNAPSHOT_S2RAM _IO(SNAPSHOT_IOC_MAGIC, 11) +#define SNAPSHOT_SET_SWAP_AREA _IOW(SNAPSHOT_IOC_MAGIC, 13, \ + struct resume_swap_area) +#define SNAPSHOT_GET_IMAGE_SIZE_IOR(SNAPSHOT_IOC_MAGIC, 14, loff_t) +#define SNAPSHOT_PLATFORM_SUPPORT _IO(SNAPSHOT_IOC_MAGIC, 15) +#define SNAPSHOT_POWER_OFF _IO(SNAPSHOT_IOC_MAGIC, 16) +#define SNAPSHOT_CREATE_IMAGE _IOW(SNAPSHOT_IOC_MAGIC, 17, int) +#define SNAPSHOT_PREF_IMAGE_SIZE _IO(SNAPSHOT_IOC_MAGIC, 18) +#define SNAPSHOT_AVAIL_SWAP_SIZE _IOR(SNAPSHOT_IOC_MAGIC, 19, loff_t) +#define SNAPSHOT_ALLOC_SWAP_PAGE _IOR(SNAPSHOT_IOC_MAGIC, 20, loff_t) +#define SNAPSHOT_IOC_MAXNR 20 + +#endif /* _LINUX_SUSPEND_IOCTLS_H */ diff --git a/kernel/power/power.h b/kernel/power/power.h index 0dd66fa..ef90605 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -1,4 +1,5 @@ #include +#include #include struct swsusp_info { @@ -134,34 +135,6 @@ extern int snapshot_write_next(struct snapshot_handle *handle, size_t count); extern void snapshot_write_finalize(struct snapshot_handle *handle); extern int snapshot_image_loaded(struct snapshot_handle *ha
[PATCH 04/37] Hibernation: Correct definitions of some ioctls (rev. 2)
From: Rafael J. Wysocki <[EMAIL PROTECTED]> Three ioctl numbers belonging to the hibernation userland interface, SNAPSHOT_ATOMIC_SNAPSHOT, SNAPSHOT_AVAIL_SWAP, SNAPSHOT_GET_SWAP_PAGE, are defined in a wrong way (eg. not portable). Provide new ioctl numbers for these ioctls and mark the existing ones as deprecated. Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> Acked-by: Pavel Machek <[EMAIL PROTECTED]> Signed-off-by: Len Brown <[EMAIL PROTECTED]> --- Documentation/power/userland-swsusp.txt | 26 +- kernel/power/power.h| 10 +- kernel/power/user.c | 18 -- 3 files changed, 34 insertions(+), 20 deletions(-) diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index 0785500..af52d53 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt @@ -27,17 +27,17 @@ once at a time. The ioctl() commands recognized by the device are: SNAPSHOT_FREEZE - freeze user space processes (the current process is - not frozen); this is required for SNAPSHOT_ATOMIC_SNAPSHOT + not frozen); this is required for SNAPSHOT_CREATE_IMAGE and SNAPSHOT_ATOMIC_RESTORE to succeed SNAPSHOT_UNFREEZE - thaw user space processes frozen by SNAPSHOT_FREEZE -SNAPSHOT_ATOMIC_SNAPSHOT - create a snapshot of the system memory; the +SNAPSHOT_CREATE_IMAGE - create a snapshot of the system memory; the last argument of ioctl() should be a pointer to an int variable, the value of which will indicate whether the call returned after creating the snapshot (1) or after restoring the system memory state from it (0) (after resume the system finds itself finishing the - SNAPSHOT_ATOMIC_SNAPSHOT ioctl() again); after the snapshot + SNAPSHOT_CREATE_IMAGE ioctl() again); after the snapshot has been created the read() operation can be used to transfer it out of the kernel @@ -49,23 +49,23 @@ SNAPSHOT_ATOMIC_RESTORE - restore the system memory state from the SNAPSHOT_FREE - free memory allocated for the snapshot image -SNAPSHOT_SET_IMAGE_SIZE - set the preferred maximum size of the image +SNAPSHOT_PREF_IMAGE_SIZE - set the preferred maximum size of the image (the kernel will do its best to ensure the image size will not exceed this number, but if it turns out to be impossible, the kernel will create the smallest image possible) SNAPSHOT_GET_IMAGE_SIZE - return the actual size of the hibernation image -SNAPSHOT_AVAIL_SWAP - return the amount of available swap in bytes (the last - argument should be a pointer to an unsigned int variable that will +SNAPSHOT_AVAIL_SWAP_SIZE - return the amount of available swap in bytes (the + last argument should be a pointer to an unsigned int variable that will contain the result if the call is successful). -SNAPSHOT_GET_SWAP_PAGE - allocate a swap page from the resume partition +SNAPSHOT_ALLOC_SWAP_PAGE - allocate a swap page from the resume partition (the last argument should be a pointer to a loff_t variable that will contain the swap page offset if the call is successful) -SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated with - SNAPSHOT_GET_SWAP_PAGE +SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated by + SNAPSHOT_ALLOC_SWAP_PAGE SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in units) from the beginning of the partition at which the swap header is @@ -102,7 +102,7 @@ The device's write() operation is used for uploading the system memory snapshot into the kernel. It has the same limitations as the read() operation. The release() operation frees all memory allocated for the snapshot image -and all swap pages allocated with SNAPSHOT_GET_SWAP_PAGE (if any). +and all swap pages allocated with SNAPSHOT_ALLOC_SWAP_PAGE (if any). Thus it is not necessary to use either SNAPSHOT_FREE or SNAPSHOT_FREE_SWAP_PAGES before closing the device (in fact it will also unfreeze user space processes frozen by SNAPSHOT_UNFREEZE if they are @@ -113,7 +113,7 @@ snapshot image from/to the kernel will use a swap parition, called the resume partition, or a swap file as storage space (if a swap file is used, the resume partition is the partition that holds this file). However, this is not really required, as they can use, for example, a special (blank) suspend partition or -a file on a partition that is unmounted before SNAPSHOT_ATOMIC_SNAPSHOT and +a file on a partition that is unmounted before SNAPSHOT_CREATE_IMAGE and mounted afterwards. These utilities MUST NOT make any assumptions regarding the ordering of @@ -135,7 +135,7 @@ means, such as checksums, to ensure the integrity of the snapshot image. The suspending and resuming utilities MUST lock themselves in memory, preferrably using mlockall(), b
Re: kernel panic on 2.6.24/iTCO_wdt not rebooting machine
On Friday 01 February 2008 10:12, Denys Fedoryshchenko wrote: > Hi > > I sent already report to netdev, but most interesting question i have, that > machine is not rebooted (it was set over sysctl value to kernel.panic) and > watchdog didnt reboot it too. > > I set: > > kernel.panic = 10 > kernel.panic_on_oops = 10 > > watchdog iTCO_wdt + watchdog from busybox, and still machine didn't came back > online from panic! But after pressing reset button by guy on location (it is > very far in mountains, roads is blocked by snow now, there is no keyboard/ > screen even to check what's happening). > > After testing i notice that iTCO_wdt not working on this motherboard. > > in dmesg > Feb 1 19:34:17 10.184.184.1 kernel: [ 58.112496] iTCO_wdt: Intel TCO > WatchDog Timer Driver v1.02 (26-Jul-2007) > Feb 1 19:34:17 10.184.184.1 kernel: [ 58.113114] iTCO_wdt: Found a ICH9R > TCO device (Version=2, TCOBASE=0x0460) > Feb 1 19:34:17 10.184.184.1 kernel: [ 58.113654] iTCO_wdt: initialized. > heartbeat=30 sec (nowayout=0) > > 1)i launch busybox watchdog: > watchdog -t 5 /dev/watchdog > i can see it in processes > > 2)then i do > killall -9 watchdog > i can see in dmesg > Feb 2 00:55:23 10.184.184.1 kernel: [ 6400.419418] iTCO_wdt: Unexpected > close, not stopping watchdog! > > Machine is not rebooting. It is not rebooting also on panic (over sysctl > value). Motherboard: Intel DP35DP > > Here is panic message, just for information. > ... > Feb 1 09:08:50 SERVER [12380.067806] Call Trace: > Feb 1 09:08:50 SERVER [12380.067839] [] > Feb 1 09:08:50 SERVER __remove_hrtimer+0x5d/0x64 > Feb 1 09:08:50 SERVER [12380.067861] [] > Feb 1 09:08:50 SERVER hrtimer_interrupt+0x10c/0x19a > Feb 1 09:08:50 SERVER [12380.067883] [] > Feb 1 09:08:50 SERVER smp_apic_timer_interrupt+0x6f/0x80 > Feb 1 09:08:50 SERVER [12380.067905] [] > Feb 1 09:08:50 SERVER apic_timer_interrupt+0x28/0x30 > Feb 1 09:08:50 SERVER [12380.067928] [] > Feb 1 09:08:50 SERVER _spin_lock_irqsave+0x13/0x27 > Feb 1 09:08:50 SERVER [12380.067949] [] > Feb 1 09:08:50 SERVER lock_hrtimer_base+0x15/0x2f > Feb 1 09:08:50 SERVER [12380.067970] [] > Feb 1 09:08:50 SERVER hrtimer_start+0x16/0xf4 > Feb 1 09:08:50 SERVER [12380.067991] [] > Feb 1 09:08:50 SERVER qdisc_watchdog_schedule+0x1e/0x21 > Feb 1 09:08:50 SERVER [12380.068013] [] > Feb 1 09:08:50 SERVER htb_dequeue+0x6ef/0x6fb [sch_htb] > Feb 1 09:08:50 SERVER [12380.068036] [] > Feb 1 09:08:50 SERVER ip_rcv+0x1fc/0x237 > Feb 1 09:08:50 SERVER [12380.068057] [] > Feb 1 09:08:50 SERVER hrtimer_get_next_event+0xae/0xbb > Feb 1 09:08:50 SERVER [12380.068078] [] > Feb 1 09:08:50 SERVER hrtimer_get_next_event+0xae/0xbb > Feb 1 09:08:50 SERVER [12380.068099] [] > Feb 1 09:08:50 SERVER getnstimeofday+0x2b/0xb5 > Feb 1 09:08:50 SERVER [12380.068118] [] > Feb 1 09:08:50 SERVER clockevents_program_event+0xe0/0xee > Feb 1 09:08:50 SERVER [12380.068140] [] > Feb 1 09:08:50 SERVER __qdisc_run+0x2a/0x163 > Feb 1 09:08:50 SERVER [12380.068161] [] > Feb 1 09:08:50 SERVER net_tx_action+0xa8/0xcc > Feb 1 09:08:50 SERVER [12380.068180] [] > Feb 1 09:08:50 SERVER qdisc_watchdog+0x0/0x1b > Feb 1 09:08:50 SERVER [12380.068199] [] > Feb 1 09:08:50 SERVER qdisc_watchdog+0x18/0x1b > Feb 1 09:08:50 SERVER [12380.068218] [] > Feb 1 09:08:50 SERVER run_hrtimer_softirq+0x4e/0x96 > Feb 1 09:08:50 SERVER [12380.068241] [] > Feb 1 09:08:50 SERVER __do_softirq+0x5d/0xc1 > Feb 1 09:08:50 SERVER [12380.068260] [] > Feb 1 09:08:50 SERVER do_softirq+0x32/0x36 > Feb 1 09:08:50 SERVER [12380.068279] [] > Feb 1 09:08:50 SERVER irq_exit+0x38/0x6b > Feb 1 09:08:50 SERVER [12380.068298] [] > Feb 1 09:08:50 SERVER smp_apic_timer_interrupt+0x74/0x80 > Feb 1 09:08:50 SERVER [12380.068319] [] > Feb 1 09:08:50 SERVER apic_timer_interrupt+0x28/0x30 > Feb 1 09:08:50 SERVER [12380.068343] [] > Feb 1 09:08:50 SERVER mwait_idle_with_hints+0x3c/0x40 > Feb 1 09:08:50 SERVER [12380.068365] [] > Feb 1 09:08:50 SERVER mwait_idle+0x0/0xa > Feb 1 09:08:50 SERVER [12380.068384] [] > Feb 1 09:08:50 SERVER cpu_idle+0x98/0xb9 > Feb 1 09:08:50 SERVER [12380.068403] [] > Feb 1 09:08:50 SERVER start_kernel+0x2d7/0x2df > Feb 1 09:08:50 SERVER [12380.068422] [] > Feb 1 09:08:50 SERVER unknown_bootoption+0x0/0x195 > Feb 1 09:08:50 SERVER [12380.068444] === What do you see if you build with CONFIG_HIGH_RES_TIMERS=n Does it work better if you boot with "acpi=off"? if yes, how about with just pnpacpi=off? thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-git9 ACPI oops - regression
On Friday 01 February 2008 09:29, Rafael J. Wysocki wrote: > [Relevant CCs added.] > > On Friday, 1 of February 2008, Lukas Hejtmanek wrote: > > Hello, > > > > I encountered oops on my laptop Lenovo T61 and dock. If I press undock on > > the > > dock, I got the following oops: this worked in 2.6.24? You are running the "dock" driver in both cases? Assuming yes... Since the dock driver itself hasn't changed, nor has the ACPI core, something in the framwork changed Is this reproducible? If yes, please open a bugzilla entry here: http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI Component = Config-hotplug and attach the output from acpidump thanks, -Len > > [79721.755165] BUG: unable to handle kernel paging request at > > 0034000e > > [79721.755165] IP: [] acpi_ns_map_handle_to_node+0x14/0x1d > > [79721.755165] PGD 7203a067 PUD 0 > > [79721.755165] Oops: [1] SMP > > [79721.755165] CPU 1 > > [79721.755165] Modules linked in: usb_storage i915 drm rfcomm l2cap > > bluetooth fuse arc4 ecb crypto_blkcipher ehci_hcd uhci_hcd e1000e > > snd_hda_intel intel_agp thinkpad_acpi [last unloaded: cfg80211] > > [79721.755165] Pid: 69, comm: kacpi_notify Not tainted 2.6.24-git9 #4 > > [79721.755165] RIP: 0010:[] [] > > acpi_ns_map_handle_to_node+0x14/0x1d > > [79721.755165] RSP: 0018:81007d197d28 EFLAGS: 00010246 > > [79721.755165] RAX: RBX: RCX: > > > > [79721.755165] RDX: 00017567 RSI: 0012c0d0 RDI: > > 00340006 > > [79721.755165] RBP: R08: 810003c0 R09: > > > > [79721.755165] R10: 0002 R11: 0001 R12: > > 00340006 > > [79721.755165] R13: R14: 81007d197d68 R15: > > 81007d197d70 > > [79721.755165] FS: () GS:81007d00e380() > > knlGS: > > [79721.755165] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b > > [79721.755165] CR2: 0034000e CR3: 7200a000 CR4: > > 06a0 > > [79721.755165] DR0: DR1: DR2: > > > > [79721.755165] DR3: DR6: 0ff0 DR7: > > 0400 > > [79721.755165] Process kacpi_notify (pid: 69, threadinfo 81007d196000, > > task 81007d18ef20) > > [79721.755165] Stack: 8034e19f 8100519e8800 81007d197de0 > > 0001 > > [79721.755165] 0001 80357d3d > > 81007d02cdc0 > > [79721.755165] 00340006 > > 8058cbaa > > [79721.755165] Call Trace: > > [79721.755165] [] acpi_get_next_object+0x3b/0x99 > > [79721.755165] [] acpi_bus_trim+0x57/0x109 > > [79721.755165] [] hotplug_dock_devices+0x97/0x117 > > [79721.755166] [] handle_eject_request+0x4e/0xd3 > > [79721.755166] [] acpi_os_execute_notify+0x0/0x2c > > [79721.755166] [] acpi_bus_get_device+0x1d/0x2e > > [79721.755166] [] acpi_bus_notify+0x17/0x4c > > [79721.755166] [] acpi_os_execute_notify+0x0/0x2c > > [79721.755166] [] acpi_ev_notify_dispatch+0x57/0x60 > > [79721.755166] [] acpi_os_execute_notify+0x23/0x2c > > [79721.755166] [] run_workqueue+0xbf/0x160 > > [79721.755166] [] worker_thread+0x0/0x100 > > [79721.755166] [] worker_thread+0x0/0x100 > > [79721.755166] [] worker_thread+0x9f/0x100 > > [79721.755166] [] autoremove_wake_function+0x0/0x30 > > [79721.755166] [] worker_thread+0x0/0x100 > > [79721.755166] [] worker_thread+0x0/0x100 > > [79721.755166] [] kthread+0x4b/0x80 > > [79721.755166] [] child_rip+0xa/0x12 > > [79721.755166] [] kthread+0x0/0x80 > > [79721.755166] [] child_rip+0x0/0x12 > > [79721.755166] > > [79721.755166] > > [79721.755166] Code: 00 c3 49 ff c1 48 83 c6 04 41 ff c8 45 85 c0 75 a8 31 > > c0 c6 06 00 c3 48 8d 47 ff 48 83 f8 fd 76 08 48 8b 05 dc 49 39 00 c3 31 c0 > > <80> 7f 08 0f 48 0f 44 c7 c3 48 89 f8 c3 31 c0 48 85 ff 74 0d f6 > > [79721.755166] RIP [] > > acpi_ns_map_handle_to_node+0x14/0x1d > > [79721.755166] RSP > > [79721.755166] CR2: 0034000e > > [79721.755166] ---[ end trace 05a1f30122eb8809 ]--- > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel panic on 2.6.24/iTCO_wdt not rebooting machine
On Friday 01 February 2008 14:15, Denys Fedoryshchenko wrote: > > On Fri, 1 Feb 2008 12:11:41 -0500, Len Brown wrote > > > > What do you see if you build with CONFIG_HIGH_RES_TIMERS=n > > > > Does it work better if you boot with "acpi=off"? > > if yes, how about with just pnpacpi=off? > > > > thanks, > > -Len > > It is not very easy to test. About bug - most probably it is related to third > party ESFQ patch, i will drop it and then test more properly when i will be > able to make watchdog work fine. But more important i notice - that iTCO_wdt > is not working at all. I think hrtimers doesn't change anything on that. > About testing, i cannot take even small risk now(and near 3-5 days) by > changing kernel options, i set now maximum available set of watchdogs, cause > there is noone to maintain server, area is unreachable because of snow and > bad weather. > > Do you think reasonable to try acpi / pnpacpi with iTCO_wdt to make it work? > Maybe just registers addresses or way how TCO watchdog activated changed on > this chipset? yes, i'm wondering if the changes in IO resource reservations in the PNPACPI layer are interfering with the native driver. unfortunately, if you boot with acpi=off or pnpacpi=off, you may run into other, unrelated, issues (or not). one way to isolate the problem is if you revert these two lines from their 2.6.24 values to their 2.6.23 values by applying this patch: --- diff --git a/include/linux/pnp.h b/include/linux/pnp.h index 2a6d62c..16b46aa 100644 --- a/include/linux/pnp.h +++ b/include/linux/pnp.h @@ -13,8 +13,8 @@ #include #include -#define PNP_MAX_PORT 40 -#define PNP_MAX_MEM12 +#define PNP_MAX_PORT 8 +#define PNP_MAX_MEM4 #define PNP_MAX_IRQ2 #define PNP_MAX_DMA2 #define PNP_NAME_LEN 50 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/