Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Kevin Hilman had written, on 03/18/2010 05:49 PM, the following:
> Nishanth Menon n...@ti.com writes:
>
>> BUG_ON should ideally not contain functional code. Move it out.
>
> True. But this code should not be using BUG_ON() in the first place.
>
> We should not crash the whole kernel in this case, just fail with a
> warning. If you're cleaning this up, can you make it fail more
> gracefully.

I agree if this was a peripheral driver or a non-critical path, but in
this case:

a) we are speaking of a core description of the h/w - OPP frequencies
   and voltages, without which the functionality of the system is at
   stake. I am not speaking of just having a basic kernel boot up to
   shell prompt - we need the kernel to do much better than that.

b) Is there any reason why the registration could fail? If it did fail
   at this point, there is something catastrophic happening - some other
   driver is going berserk OR the OPP layer is by itself screwed up -
   why continue if we can warn the system and force a fix of the code?

c) is there a recovery mechanism to put the system back in a usable mode
   with dvfs etc? I would prefer to get some ideas on it.

> Kevin

--
Regards,
Nishanth Menon
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
On Fri, Mar 19, 2010 at 03:21:34PM +0100, ext Nishanth Menon wrote:
> Kevin Hilman had written, on 03/18/2010 05:49 PM, the following:
>> Nishanth Menon n...@ti.com writes:
>>
>>> BUG_ON should ideally not contain functional code. Move it out.
>>
>> True. But this code should not be using BUG_ON() in the first place.
>>
>> We should not crash the whole kernel in this case, just fail with a
>> warning. If you're cleaning this up, can you make it fail more
>> gracefully.
>
> I agree if this was a peripheral driver or a non-critical path, but in
> this case:
>
> a) we are speaking of a core description of the h/w - OPP frequencies
>    and voltages, without which the functionality of the system is at
>    stake. I am not speaking of just having a basic kernel boot up to
>    shell prompt - we need the kernel to do much better than that.
>
> b) Is there any reason why the registration could fail? If it did fail
>    at this point, there is something catastrophic happening - some
>    other driver is going berserk OR the OPP layer is by itself screwed
>    up - why continue if we can warn the system and force a fix of the
>    code?

I agree with Nishanth here. If it fails at this point, we deserve the
BUG().

--
balbi
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Nishanth Menon n...@ti.com writes:

> Kevin Hilman had written, on 03/18/2010 05:49 PM, the following:
>> Nishanth Menon n...@ti.com writes:
>>
>>> BUG_ON should ideally not contain functional code. Move it out.
>>
>> True. But this code should not be using BUG_ON() in the first place.
>>
>> We should not crash the whole kernel in this case, just fail with a
>> warning. If you're cleaning this up, can you make it fail more
>> gracefully.
>
> I agree if this was a peripheral driver or a non-critical path, but in
> this case:
>
> a) we are speaking of a core description of the h/w - OPP frequencies
>    and voltages, without which the functionality of the system is at
>    stake. I am not speaking of just having a basic kernel boot up to
>    shell prompt - we need the kernel to do much better than that.

A system can boot fine without OPPs/DVFS. OPPs will not be registered
with CPUfreq, and no DVFS attempts will be made.

> b) Is there any reason why the registration could fail? If it did fail
>    at this point, there is something catastrophic happening - some
>    other driver is going berserk OR the OPP layer is by itself screwed
>    up - why continue if we can warn the system and force a fix of the
>    code?

Using WARN() will produce a nice loud message that alerts users, gets
reported and gets fixed as well.

> c) is there a recovery mechanism to put the system back in a usable
>    mode with dvfs etc? I would prefer to get some ideas on it.

What is to recover from? While not optimal in power/performance, a
kernel can boot and work just fine without OPPs/DVFS. If this call
fails, no DVFS will be available but the system is still quite usable.

The bigger problem is that everyone thinks that their feature/subsystem
is so crucial that any problems should hang the system, when things
could actually continue just fine without it in most cases.

IMO, using BUG* macros usually indicates improper or incomplete error
handling rather than a real catastrophic system failure.

Kevin
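The warn-and-continue behaviour argued for above can be sketched in user space. This is a minimal illustration under stated assumptions, not kernel code: warn_on(), mock_register() and dvfs_available are hypothetical names, and warn_on() only imitates the spirit of WARN() (report loudly, then return control to the caller).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static int warn_count;              /* how many warnings were emitted */
static bool dvfs_available;         /* hypothetical "feature usable" flag */

/* Hypothetical WARN-like helper: report the problem and carry on. */
static int warn_on(int cond, const char *what)
{
	if (cond) {
		fprintf(stderr, "WARNING: %s\n", what);
		warn_count++;
	}
	return cond != 0;
}

/* Mock registration call: returns 0 on success, -EINVAL on failure. */
static int mock_register(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

/*
 * The registration call runs unconditionally; only its result is
 * checked. On failure the feature is marked unavailable and the
 * caller gets an error code instead of a halted system.
 */
static int pm_init(int should_fail)
{
	int r = mock_register(should_fail);

	if (warn_on(r != 0, "OPP table registration failed, DVFS disabled")) {
		dvfs_available = false;
		return r;
	}
	dvfs_available = true;
	return 0;
}
```

Calling pm_init(1) prints one warning to stderr, leaves dvfs_available false and returns -EINVAL; the program keeps running, which is the point of preferring a warning over a fatal check here.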
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
> IMO, using BUG* macros usually indicates improper or incomplete error
> handling rather than a real catastrophic system failure.

On the other hand, a kernel oops and system hang will always get noted,
rather than a WARN() which simply sits in the log buffer. Remember that
not all users will be looking at the console. If there's a kernel hang,
we get a core dump which is saved in the mtdoops (if you're using it)
and can be sent later to some central server for analysis.

--
balbi
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Felipe Balbi m...@felipebalbi.com writes:

> On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
>> IMO, using BUG* macros usually indicates improper or incomplete error
>> handling rather than a real catastrophic system failure.
>
> On the other hand, a kernel oops and system hang will always get
> noted, rather than a WARN() which simply sits in the log buffer.

Of course, but what I'm trying to avoid is making other people deal with
a BUG inserted by a developer when proper error checking and recovery is
what is really needed.

Kevin
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Kevin Hilman had written, on 03/19/2010 01:42 PM, the following:
> Felipe Balbi m...@felipebalbi.com writes:
>
>> On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
>>> IMO, using BUG* macros usually indicates improper or incomplete
>>> error handling rather than a real catastrophic system failure.
>>
>> On the other hand, a kernel oops and system hang will always get
>> noted, rather than a WARN() which simply sits in the log buffer.
>
> Of course, but what I'm trying to avoid is making other people deal
> with a BUG inserted by a developer when proper error checking and
> recovery is what is really needed.

I respect your views, but a few moments of thought: what would the
recovery look like? I can think of 2 options here.. do share your views:

Option 1:

	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0])) {
		WARN(1, "mpu OPP table registration failed\n");
		return;
	}
	if (opp_init_list(OPP_L3, omap3_opp_def_list[1])) {
		WARN(1, "l3 OPP table registration failed\n");
		return;
	}
	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2])) {
		WARN(1, "dsp OPP table registration failed\n");
		return;
	}

Option 2:

	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0]))
		return;
	if (opp_init_list(OPP_L3, omap3_opp_def_list[1]))
		goto mpu_disable;
	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2]))
		goto l3_disable;
	return;

l3_disable:
	freq = 0;
	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_L3, &freq))) {
		opp_disable(opp);
		freq++;
	}
mpu_disable:
	freq = 0;
	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_MPU, &freq))) {
		opp_disable(opp);
		freq++;
	}
	WARN(1, "Registration of OPP tables failed!!\n");
	return;

Option 1 is a bad idea as it leaves the system in an invalid state.
Option 2 is the better idea, as we don't have an opp_delete option (not
required usually).

All that code for something that will almost never happen?

--
Regards,
Nishanth Menon
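The all-or-nothing unwinding of Option 2 can be tried out in user space. The following is a minimal sketch under stated assumptions: the domain ids, mock_init_list(), mock_disable_all() and the opp_enabled table are hypothetical stand-ins for the OPP layer, and unlike the option as posted it also warns when the first registration fails.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum { DOM_MPU, DOM_L3, DOM_DSP, DOM_COUNT };   /* hypothetical domain ids */
#define OPPS_PER_DOMAIN 3

static bool opp_enabled[DOM_COUNT][OPPS_PER_DOMAIN];

/* Mock registration: enables every OPP of a domain unless told to fail. */
static int mock_init_list(int dom, int fail_dom)
{
	int i;

	if (dom == fail_dom)
		return -1;
	for (i = 0; i < OPPS_PER_DOMAIN; i++)
		opp_enabled[dom][i] = true;
	return 0;
}

/* Mock of the "disable each OPP" loops in the option above. */
static void mock_disable_all(int dom)
{
	int i;

	for (i = 0; i < OPPS_PER_DOMAIN; i++)
		opp_enabled[dom][i] = false;
}

/* Register MPU, L3 and DSP; undo earlier domains if a later one fails. */
static int init_all_or_nothing(int fail_dom)
{
	int r;

	memset(opp_enabled, 0, sizeof(opp_enabled));

	r = mock_init_list(DOM_MPU, fail_dom);
	if (r)
		goto warn;
	r = mock_init_list(DOM_L3, fail_dom);
	if (r)
		goto mpu_disable;
	r = mock_init_list(DOM_DSP, fail_dom);
	if (r)
		goto l3_disable;
	return 0;

l3_disable:
	mock_disable_all(DOM_L3);
mpu_disable:
	mock_disable_all(DOM_MPU);
warn:
	fprintf(stderr, "warning: registration of OPP tables failed (%d)\n", r);
	return r;
}

/* Helper for inspection: number of OPPs currently enabled. */
static int count_enabled(void)
{
	int dom, i, n = 0;

	for (dom = 0; dom < DOM_COUNT; dom++)
		for (i = 0; i < OPPS_PER_DOMAIN; i++)
			n += opp_enabled[dom][i];
	return n;
}
```

The labels fall through in reverse order of registration, so a failure at any stage leaves no domain partially configured. Keeping the first error code in r, rather than OR-ing the three results together, also preserves which error was reported.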
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Nishanth Menon n...@ti.com writes:

> Kevin Hilman had written, on 03/19/2010 01:42 PM, the following:
>> Felipe Balbi m...@felipebalbi.com writes:
>>
>>> On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
>>>> IMO, using BUG* macros usually indicates improper or incomplete
>>>> error handling rather than a real catastrophic system failure.
>>>
>>> On the other hand, a kernel oops and system hang will always get
>>> noted, rather than a WARN() which simply sits in the log buffer.
>>
>> Of course, but what I'm trying to avoid is making other people deal
>> with a BUG inserted by a developer when proper error checking and
>> recovery is what is really needed.
>
> I respect your views, but a few moments of thought: what would the
> recovery look like? I can think of 2 options here.. do share your
> views:
>
> Option 1:
>
> 	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0])) {
> 		WARN(1, "mpu OPP table registration failed\n");
> 		return;
> 	}
> 	if (opp_init_list(OPP_L3, omap3_opp_def_list[1])) {
> 		WARN(1, "l3 OPP table registration failed\n");
> 		return;
> 	}
> 	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2])) {
> 		WARN(1, "dsp OPP table registration failed\n");
> 		return;
> 	}
>
> Option 2:
>
> 	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0]))
> 		return;
> 	if (opp_init_list(OPP_L3, omap3_opp_def_list[1]))
> 		goto mpu_disable;
> 	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2]))
> 		goto l3_disable;
> 	return;
>
> l3_disable:
> 	freq = 0;
> 	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_L3, &freq))) {
> 		opp_disable(opp);
> 		freq++;
> 	}
> mpu_disable:
> 	freq = 0;
> 	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_MPU, &freq))) {
> 		opp_disable(opp);
> 		freq++;
> 	}
> 	WARN(1, "Registration of OPP tables failed!!\n");
> 	return;
>
> Option 1 is a bad idea as it leaves the system in an invalid state.
> Option 2 is the better idea, as we don't have an opp_delete option
> (not required usually).

I'm OK with either actually, as either is better than BUG_ON() instead
of error checking.

With option 1, the system is not really in an invalid state, just an
untested state. What will happen is users of the OPP API will get error
codes when trying to get OPPs and they should fail gracefully as well.

Once again, this is about proper error checking and robustness, not
about causing a panic() when something (relatively) minor happens.

> All that code for something that will almost never happen?

Yes. That's what error checking is all about.

Kevin

P.S. I can't find the ref atm, but I mentioned not liking the BUG_ON
early in the review of the OPP layer, and it would need to be cleaned up
before upstream merge.
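The point about users of the OPP API receiving error codes and failing gracefully can be sketched from the consumer's side. This is a hedged illustration only: struct opp_entry, mock_opp_find_ceil(), pick_frequency() and the sample frequencies are hypothetical and do not reflect the actual OPP layer interface.

```c
#include <errno.h>
#include <stddef.h>

struct opp_entry {
	unsigned long freq_hz;
};

/* Made-up sample table, sorted by ascending frequency. */
static const struct opp_entry sample_table[] = {
	{ 125000000UL }, { 250000000UL }, { 500000000UL },
};

/*
 * Hypothetical lookup: find the lowest entry at or above want_hz.
 * Returns 0 and fills *out_hz, -ENODEV if no table was registered,
 * or -ERANGE if no entry is high enough.
 */
static int mock_opp_find_ceil(const struct opp_entry *table, size_t n,
			      unsigned long want_hz, unsigned long *out_hz)
{
	size_t i;

	if (table == NULL || n == 0)
		return -ENODEV;
	for (i = 0; i < n; i++) {
		if (table[i].freq_hz >= want_hz) {
			*out_hz = table[i].freq_hz;
			return 0;
		}
	}
	return -ERANGE;
}

/*
 * Consumer: on any lookup error, stay at the boot-time frequency
 * instead of treating the missing table as fatal.
 */
static unsigned long pick_frequency(const struct opp_entry *table, size_t n,
				    unsigned long want_hz,
				    unsigned long boot_hz)
{
	unsigned long hz;

	if (mock_opp_find_ceil(table, n, want_hz, &hz))
		return boot_hz;
	return hz;
}
```

With the sample table, a request for 300 MHz resolves to the 500 MHz entry, while the same request against a missing table (NULL, 0) falls back to whatever boot frequency the caller passes in.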
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Kevin Hilman had written, on 03/19/2010 03:49 PM, the following:
> Nishanth Menon n...@ti.com writes:
>
>> Kevin Hilman had written, on 03/19/2010 01:42 PM, the following:
>>> Felipe Balbi m...@felipebalbi.com writes:
>>>
>>>> On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
>>>>> IMO, using BUG* macros usually indicates improper or incomplete
>>>>> error handling rather than a real catastrophic system failure.
>>>>
>>>> On the other hand, a kernel oops and system hang will always get
>>>> noted, rather than a WARN() which simply sits in the log buffer.
>>>
>>> Of course, but what I'm trying to avoid is making other people deal
>>> with a BUG inserted by a developer when proper error checking and
>>> recovery is what is really needed.
>>
>> I respect your views, but a few moments of thought: what would the
>> recovery look like? I can think of 2 options here.. do share your
>> views:
>>
>> Option 1:
>>
>> 	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0])) {
>> 		WARN(1, "mpu OPP table registration failed\n");
>> 		return;
>> 	}
>> 	if (opp_init_list(OPP_L3, omap3_opp_def_list[1])) {
>> 		WARN(1, "l3 OPP table registration failed\n");
>> 		return;
>> 	}
>> 	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2])) {
>> 		WARN(1, "dsp OPP table registration failed\n");
>> 		return;
>> 	}
>>
>> Option 2:
>>
>> 	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0]))
>> 		return;
>> 	if (opp_init_list(OPP_L3, omap3_opp_def_list[1]))
>> 		goto mpu_disable;
>> 	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2]))
>> 		goto l3_disable;
>> 	return;
>>
>> l3_disable:
>> 	freq = 0;
>> 	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_L3, &freq))) {
>> 		opp_disable(opp);
>> 		freq++;
>> 	}
>> mpu_disable:
>> 	freq = 0;
>> 	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_MPU, &freq))) {
>> 		opp_disable(opp);
>> 		freq++;
>> 	}
>> 	WARN(1, "Registration of OPP tables failed!!\n");
>> 	return;
>>
>> Option 1 is a bad idea as it leaves the system in an invalid state.
>> Option 2 is the better idea, as we don't have an opp_delete option
>> (not required usually).
>
> I'm OK with either actually, as either is better than BUG_ON() instead
> of error checking.
>
> With option 1, the system is not really in an invalid state, just an
> untested state. What will happen is users of the OPP API will get
> error codes when trying to get OPPs and they should fail gracefully as
> well.

Except that if L3/DSP registration fails, you allow MPU freqs to be
configured but no DSP or L3 freqs. I prefer option 2 in that respect:
it is binary - you get all the configurations done right OR you don't.

> Once again, this is about proper error checking and robustness, not
> about causing a panic() when something (relatively) minor happens.
>
>> All that code for something that will almost never happen?
>
> Yes. That's what error checking is all about.
>
> Kevin
>
> P.S. I can't find the ref atm, but I mentioned not liking the BUG_ON
> early in the review of the OPP layer, and it would need to be cleaned
> up before upstream merge.

alrite alrite.. I am black and blue with the beatings ;). I am going
with option 2 if there are no other objections. Will send out the patch
after collating other comments.

--
Regards,
Nishanth Menon
[PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
BUG_ON should ideally not contain functional code. Move it out.

Ref: http://marc.info/?l=linux-kernel&m=109391212925546&w=2

Cc: Ambresh K ambr...@ti.com
Cc: Benoit Cousson b-cous...@ti.com
Cc: Eduardo Valentin eduardo.valen...@nokia.com
Cc: Kevin Hilman khil...@deeprootsystems.com
Cc: Phil Carmody ext-phil.2.carm...@nokia.com
Cc: Sanjeev Premi pr...@ti.com
Cc: Tero Kristo tero.kri...@nokia.com
Cc: Thara Gopinath th...@ti.com

Signed-off-by: Nishanth Menon n...@ti.com
---
 arch/arm/mach-omap2/cpufreq34xx.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-omap2/cpufreq34xx.c b/arch/arm/mach-omap2/cpufreq34xx.c
index c453ec5..f0ed3ae 100644
--- a/arch/arm/mach-omap2/cpufreq34xx.c
+++ b/arch/arm/mach-omap2/cpufreq34xx.c
@@ -111,6 +111,7 @@ static struct omap_opp_def __initdata omap36xx_dsp_rate_table[] = {
 
 void __init omap3_pm_init_opp_table(void)
 {
+	int r;
 	struct omap_opp_def **omap3_opp_def_list;
 	struct omap_opp_def *omap34xx_opp_def_list[] = {
 		omap34xx_mpu_rate_table,
@@ -126,8 +127,9 @@ void __init omap3_pm_init_opp_table(void)
 	omap3_opp_def_list = cpu_is_omap3630() ? omap36xx_opp_def_list :
 				omap34xx_opp_def_list;
 
-	BUG_ON(opp_init_list(OPP_MPU, omap3_opp_def_list[0]));
-	BUG_ON(opp_init_list(OPP_L3, omap3_opp_def_list[1]));
-	BUG_ON(opp_init_list(OPP_DSP, omap3_opp_def_list[2]));
+	r = opp_init_list(OPP_MPU, omap3_opp_def_list[0]);
+	r |= opp_init_list(OPP_L3, omap3_opp_def_list[1]);
+	r |= opp_init_list(OPP_DSP, omap3_opp_def_list[2]);
+	BUG_ON(r);
 }
--
1.6.3.3
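The concern in the commit message can be illustrated outside the kernel. A minimal user-space sketch, assuming a check macro that a build may compile out (as the C library's assert() is under NDEBUG): any call placed inside the macro disappears with it. CHECK_ON_COMPILED_OUT and mock_register_table are hypothetical names, not kernel APIs, and the sketch makes no claim about how a particular kernel configuration expands BUG_ON().

```c
/*
 * Hypothetical stand-in for a check that has been compiled out.
 * sizeof keeps the expression type-checked but never evaluates it,
 * so any side effect inside the parentheses is lost.
 */
#define CHECK_ON_COMPILED_OUT(cond) ((void)sizeof(cond))

static int tables_registered;

/* Mock registration call: records the side effect, returns 0 on success. */
static int mock_register_table(void)
{
	tables_registered++;
	return 0;
}

/* Fragile: the registration lives inside the check and vanishes with it. */
static int init_with_call_inside_check(void)
{
	tables_registered = 0;
	CHECK_ON_COMPILED_OUT(mock_register_table());
	return tables_registered;
}

/* Robust: the call always runs; only the test of its result is optional. */
static int init_with_call_outside_check(void)
{
	int r;

	tables_registered = 0;
	r = mock_register_table();
	CHECK_ON_COMPILED_OUT(r);
	return tables_registered;
}
```

The second function mirrors the shape of the patch: the functional calls are made first and the check only inspects the stored result, so disabling the check cannot change what gets registered.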
Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup
Nishanth Menon n...@ti.com writes:

> BUG_ON should ideally not contain functional code. Move it out.

True. But this code should not be using BUG_ON() in the first place.

We should not crash the whole kernel in this case, just fail with a
warning. If you're cleaning this up, can you make it fail more
gracefully.

Kevin