Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Nishanth Menon

Kevin Hilman had written, on 03/18/2010 05:49 PM, the following:

Nishanth Menon n...@ti.com writes:


BUG_ON should ideally not contain functional code. Move it out.


True.  But this code should not be using BUG_ON() in the first place.

We should not crash the whole kernel in this case, just fail
with a warning.

If you're cleaning this up, can you make it fail more gracefully.

I agree if this were a peripheral driver or a non-critical path, but in
this case:

a) We are speaking of a core description of the h/w: OPP frequencies
and voltages, without which the functionality of the system is at
stake. I am not speaking of just having a basic kernel boot up to a shell
prompt - we need the kernel to do much better than that.

b) Is there any reason why the registration could fail? If it did fail
at this point, there is something catastrophic happening - some other
driver is going berserk OR the OPP layer is itself screwed up - why
continue if we can warn the system and force a fix of the code?

c) Is there a recovery mechanism to put the system back in a usable mode
with DVFS etc.? I would like to get some ideas on it.




Kevin



Ref: http://marc.info/?l=linux-kernel&m=109391212925546&w=2

Cc: Ambresh K ambr...@ti.com
Cc: Benoit Cousson b-cous...@ti.com
Cc: Eduardo Valentin eduardo.valen...@nokia.com
Cc: Kevin Hilman khil...@deeprootsystems.com
Cc: Phil Carmody ext-phil.2.carm...@nokia.com
Cc: Sanjeev Premi pr...@ti.com
Cc: Tero Kristo tero.kri...@nokia.com
Cc: Thara Gopinath th...@ti.com

Signed-off-by: Nishanth Menon n...@ti.com
---
 arch/arm/mach-omap2/cpufreq34xx.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-omap2/cpufreq34xx.c b/arch/arm/mach-omap2/cpufreq34xx.c
index c453ec5..f0ed3ae 100644
--- a/arch/arm/mach-omap2/cpufreq34xx.c
+++ b/arch/arm/mach-omap2/cpufreq34xx.c
@@ -111,6 +111,7 @@ static struct omap_opp_def __initdata omap36xx_dsp_rate_table[] = {
 
 void __init omap3_pm_init_opp_table(void)
 {
+	int r;
 	struct omap_opp_def **omap3_opp_def_list;
 	struct omap_opp_def *omap34xx_opp_def_list[] = {
 		omap34xx_mpu_rate_table,
@@ -126,8 +127,9 @@ void __init omap3_pm_init_opp_table(void)
 	omap3_opp_def_list = cpu_is_omap3630() ? omap36xx_opp_def_list :
 			omap34xx_opp_def_list;
 
-	BUG_ON(opp_init_list(OPP_MPU, omap3_opp_def_list[0]));
-	BUG_ON(opp_init_list(OPP_L3, omap3_opp_def_list[1]));
-	BUG_ON(opp_init_list(OPP_DSP, omap3_opp_def_list[2]));
+	r = opp_init_list(OPP_MPU, omap3_opp_def_list[0]);
+	r |= opp_init_list(OPP_L3, omap3_opp_def_list[1]);
+	r |= opp_init_list(OPP_DSP, omap3_opp_def_list[2]);
+	BUG_ON(r);
 }
 
--

1.6.3.3



--
Regards,
Nishanth Menon
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Felipe Balbi

On Fri, Mar 19, 2010 at 03:21:34PM +0100, ext Nishanth Menon wrote:

Kevin Hilman had written, on 03/18/2010 05:49 PM, the following:

Nishanth Menon n...@ti.com writes:


BUG_ON should ideally not contain functional code. Move it out.


True.  But this code should not be using BUG_ON() in the first place.

We should not crash the whole kernel in this case, just fail
with a warning.

If you're cleaning this up, can you make it fail more gracefully.

I agree if this were a peripheral driver or a non-critical path, but in
this case:

a) We are speaking of a core description of the h/w: OPP frequencies
and voltages, without which the functionality of the system is at
stake. I am not speaking of just having a basic kernel boot up to a shell
prompt - we need the kernel to do much better than that.

b) Is there any reason why the registration could fail? If it did fail
at this point, there is something catastrophic happening - some other
driver is going berserk OR the OPP layer is itself screwed up - why
continue if we can warn the system and force a fix of the code?


I agree with Nishanth here. If it fails at this point, we deserve the 
BUG().


--
balbi


Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Kevin Hilman
Nishanth Menon n...@ti.com writes:

 Kevin Hilman had written, on 03/18/2010 05:49 PM, the following:
 Nishanth Menon n...@ti.com writes:

 BUG_ON should ideally not contain functional code. Move it out.

 True.  But this code should not be using BUG_ON() in the first place.

 We should not crash the whole kernel in this case, just fail
 with a warning.

 If you're cleaning this up, can you make it fail more gracefully.

 I agree if this were a peripheral driver or a non-critical path, but
 in this case:

 a) We are speaking of a core description of the h/w: OPP frequencies
 and voltages, without which the functionality of the system is at
 stake. I am not speaking of just having a basic kernel boot up to a
 shell prompt - we need the kernel to do much better than that.

A system can boot fine without OPPs/DVFS.  OPPs will not be
registered with CPUfreq, and no DVFS attempts will be made.

 b) Is there any reason why the registration could fail? If it did
 fail at this point, there is something catastrophic happening - some
 other driver is going berserk OR the OPP layer is itself screwed up -
 why continue if we can warn the system and force a fix of the code?

Using WARN() will produce a nice loud message that alerts users, gets
reported, and gets fixed as well.

 c) Is there a recovery mechanism to put the system back in a usable
 mode with DVFS etc.? I would like to get some ideas on it.

What is there to recover from?  While not optimal in power/performance, a
kernel can boot and work just fine without OPPs/DVFS.  If this call
fails, no DVFS will be available but the system is still quite usable.

The bigger problem is that everyone thinks that their
feature/subsystem is so crucial that any problems should hang the
system, when things could actually continue just fine without it in
most cases.

IMO, using BUG* macros usually indicates improper or incomplete error
handling rather than a real catastrophic system failure.

Kevin
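
The "warn and carry on without DVFS" behaviour argued for above can be sketched in plain user-space C. This is only an illustration under stated assumptions: sketch_warn() is a hypothetical stand-in for the kernel's WARN(), sketch_init_list() is a stub in place of opp_init_list(), and the dvfs_available flag is invented for the example; none of these names come from the OMAP tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool dvfs_available;	/* hypothetical flag: may DVFS be used? */
static int warnings;		/* number of warnings emitted so far */

/* Hypothetical stand-in for WARN(): complain loudly, keep running. */
static void sketch_warn(const char *msg)
{
	warnings++;
	fprintf(stderr, "WARNING: %s\n", msg);
}

/* Stub for a table registration call; fails only for id == fail_id. */
static int sketch_init_list(int id, int fail_id)
{
	return id == fail_id ? -1 : 0;
}

/*
 * Register three tables (ids 1..3). On any failure, warn and report the
 * error to the caller instead of halting; DVFS simply stays unavailable.
 */
static int sketch_init_opp_table(int fail_id)
{
	int id;

	dvfs_available = false;
	for (id = 1; id <= 3; id++) {
		if (sketch_init_list(id, fail_id)) {
			sketch_warn("OPP table registration failed, DVFS disabled");
			return -1;
		}
	}
	dvfs_available = true;
	return 0;
}
```

A caller would check the return value (or the flag) and skip the DVFS setup when registration failed, while the rest of the boot continues.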



Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Felipe Balbi
On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
 IMO, Using BUG* macros usually indicates improper or incomplete error
 handling rather than a real catastrophic system failure.

On the other hand, a kernel oops and system hang will always get noticed,
unlike a WARN(), which simply sits in the log buffer. Remember that not all
users will be looking at the console. If there's a kernel hang, we get a
core dump which is saved in mtdoops (if you're using it) and can be
sent later to some central server for analysis.

-- 
balbi


Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Kevin Hilman
Felipe Balbi m...@felipebalbi.com writes:

 On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
 IMO, Using BUG* macros usually indicates improper or incomplete error
 handling rather than a real catastrophic system failure.

 on the other hand a kernel oops and system hang will always get
 noted. Rather than a WARN() which simply sits in the log buffer.

Of course, but what I'm trying to avoid is making other people deal
with a BUG inserted by a developer when proper error checking and
recovery is what is really needed.

Kevin


Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Nishanth Menon

Kevin Hilman had written, on 03/19/2010 01:42 PM, the following:

Felipe Balbi m...@felipebalbi.com writes:


On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:

IMO, Using BUG* macros usually indicates improper or incomplete error
handling rather than a real catastrophic system failure.

on the other hand a kernel oops and system hang will always get
noted. Rather than a WARN() which simply sits in the log buffer.


Of course, but what I'm trying to avoid is making other people deal
with a BUG inserted by a developer when proper error checking and
recovery is what is really needed.


I respect your views, but a few moments of thought:
what would the recovery look like? I can think of 2 options here.. do
share your views:

Option 1:
	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0])) {
		WARN(1, "MPU OPP table registration failed\n");
		return;
	}
	if (opp_init_list(OPP_L3, omap3_opp_def_list[1])) {
		WARN(1, "L3 OPP table registration failed\n");
		return;
	}
	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2])) {
		WARN(1, "DSP OPP table registration failed\n");
		return;
	}

Option 2:
	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0]))
		return;
	if (opp_init_list(OPP_L3, omap3_opp_def_list[1]))
		goto mpu_disable;
	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2]))
		goto l3_disable;
	return;

l3_disable:
	freq = 0;
	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_L3, freq))) {
		opp_disable(opp);
		freq++;
	}
mpu_disable:
	freq = 0;
	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_MPU, freq))) {
		opp_disable(opp);
		freq++;
	}
	WARN(1, "Registration of OPP tables failed!!\n");
	return;

Option 1 is a bad idea as it leaves the system in an invalid state.
Option 2 is the better idea as we don't have an opp_delete option (not
required usually).


All that code for something that will almost never happen?
--
Regards,
Nishanth Menon


Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Kevin Hilman
Nishanth Menon n...@ti.com writes:

 Kevin Hilman had written, on 03/19/2010 01:42 PM, the following:
 Felipe Balbi m...@felipebalbi.com writes:

 On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:
 IMO, Using BUG* macros usually indicates improper or incomplete error
 handling rather than a real catastrophic system failure.
 on the other hand a kernel oops and system hang will always get
 noted. Rather than a WARN() which simply sits in the log buffer.

 Of course, but what I'm trying to avoid is making other people deal
 with a BUG inserted by a developer when proper error checking and
 recovery is what is really needed.

 I respect your views, but a few moments of thought:
 what would the recovery look like? I can think of 2 options here.. do
 share your views:

 Option 1:
 	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0])) {
 		WARN(1, "MPU OPP table registration failed\n");
 		return;
 	}
 	if (opp_init_list(OPP_L3, omap3_opp_def_list[1])) {
 		WARN(1, "L3 OPP table registration failed\n");
 		return;
 	}
 	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2])) {
 		WARN(1, "DSP OPP table registration failed\n");
 		return;
 	}

 Option 2:
 	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0]))
 		return;
 	if (opp_init_list(OPP_L3, omap3_opp_def_list[1]))
 		goto mpu_disable;
 	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2]))
 		goto l3_disable;
 	return;

 l3_disable:
 	freq = 0;
 	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_L3, freq))) {
 		opp_disable(opp);
 		freq++;
 	}
 mpu_disable:
 	freq = 0;
 	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_MPU, freq))) {
 		opp_disable(opp);
 		freq++;
 	}
 	WARN(1, "Registration of OPP tables failed!!\n");
 	return;

 Option 1 is a bad idea as it leaves the system in an invalid state.
 Option 2 is the better idea as we don't have an opp_delete option (not
 required usually).

I'm OK with either actually, as either is better than BUG_ON()
instead of error checking.

With option 1, the system is not really in an invalid state, just an
untested state.  What will happen is users of the OPP API will get error
codes when trying to get OPPs and they should fail gracefully as well.

Once again, this is about proper error checking, and robustness, not
about causing a panic() when something (relatively) minor happens.

 All that code for something that will almost never happen?

Yes.  That's what error checking is all about.

Kevin

P.S. I can't find the ref atm, but I mentioned not liking the BUG_ON
 early in the review of the OPP layer, and it would need to be
 cleaned up before upstream merge.
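
The point that users of the OPP API get error codes and should themselves fail gracefully can be illustrated with a small user-space sketch. Everything here is hypothetical: the sketch_* helpers imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention rather than reusing it, sketch_find_freq_ceil() is not the real opp_find_freq_ceil(), and the rates in the table are made-up numbers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* User-space imitation of the error-pointer convention (assumption). */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int sketch_is_err(const void *p)
{
	/* The top SKETCH_MAX_ERRNO addresses are reserved for error codes. */
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_opp {
	unsigned long rate_hz;
};

/* Made-up table, sorted by ascending rate. */
static struct sketch_opp table[] = {
	{ 125000000UL }, { 250000000UL }, { 500000000UL },
};
static int table_registered;	/* 0 models a failed registration */

/* Lowest entry whose rate is >= freq, or an encoded error. */
static struct sketch_opp *sketch_find_freq_ceil(unsigned long freq)
{
	size_t i;

	if (!table_registered)
		return sketch_err_ptr(-ENODEV);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (table[i].rate_hz >= freq)
			return &table[i];
	return sketch_err_ptr(-ERANGE);
}

/* Caller that degrades gracefully: keep the fallback rate on any error. */
static unsigned long sketch_pick_rate(unsigned long want, unsigned long fallback)
{
	struct sketch_opp *opp = sketch_find_freq_ceil(want);

	if (sketch_is_err(opp))
		return fallback;
	return opp->rate_hz;
}
```

With the table unregistered, sketch_pick_rate() just returns the fallback rate, which is the kind of graceful failure described for option 1.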


Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-19 Thread Nishanth Menon

Kevin Hilman had written, on 03/19/2010 03:49 PM, the following:

Nishanth Menon n...@ti.com writes:


Kevin Hilman had written, on 03/19/2010 01:42 PM, the following:

Felipe Balbi m...@felipebalbi.com writes:


On Fri, Mar 19, 2010 at 10:46:54AM -0700, Kevin Hilman wrote:

IMO, Using BUG* macros usually indicates improper or incomplete error
handling rather than a real catastrophic system failure.

on the other hand a kernel oops and system hang will always get
noted. Rather than a WARN() which simply sits in the log buffer.

Of course, but what I'm trying to avoid is making other people deal
with a BUG inserted by a developer when proper error checking and
recovery is what is really needed.


I respect your views, but a few moments of thought:
what would the recovery look like? I can think of 2 options here.. do
share your views:

Option 1:
	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0])) {
		WARN(1, "MPU OPP table registration failed\n");
		return;
	}
	if (opp_init_list(OPP_L3, omap3_opp_def_list[1])) {
		WARN(1, "L3 OPP table registration failed\n");
		return;
	}
	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2])) {
		WARN(1, "DSP OPP table registration failed\n");
		return;
	}

Option 2:
	if (opp_init_list(OPP_MPU, omap3_opp_def_list[0]))
		return;
	if (opp_init_list(OPP_L3, omap3_opp_def_list[1]))
		goto mpu_disable;
	if (opp_init_list(OPP_DSP, omap3_opp_def_list[2]))
		goto l3_disable;
	return;

l3_disable:
	freq = 0;
	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_L3, freq))) {
		opp_disable(opp);
		freq++;
	}
mpu_disable:
	freq = 0;
	while (!IS_ERR(opp = opp_find_freq_ceil(OPP_MPU, freq))) {
		opp_disable(opp);
		freq++;
	}
	WARN(1, "Registration of OPP tables failed!!\n");
	return;

Option 1 is a bad idea as it leaves the system in an invalid state.
Option 2 is the better idea as we don't have an opp_delete option (not
required usually).


I'm OK with either actually, as either is better than BUG_ON()
instead of error checking.

With option 1, the system is not really in an invalid state, just an
untested state.  What will happen is users of the OPP API will get error
codes when trying to get OPPs and they should fail gracefully as well.

Except if L3/DSP registration fails, you allow MPU freqs to be configured
but no DSP or L3 freq.. I prefer option 2 in that respect.. it is binary -
you get all the configurations done right OR you don't.


Once again, this is about proper error checking, and robustness, not
about causing a panic() when something (relatively) minor happens.


All that code for something that will almost never happen?  


Yes.  That's what error checking is all about.

Kevin

P.S. I can't find the ref atm, but I mentioned not liking the BUG_ON
 early in the review of the OPP layer, and it would need to be
 cleaned up before upstream merge.

Alright, alright.. I am black and blue with the beatings ;). I am going
with option 2 if there are no other objections.. I will send out the patch
after collating other comments.


--
Regards,
Nishanth Menon
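
The "binary" all-or-nothing behaviour of option 2 can be sketched as a self-contained user-space program fragment using the usual goto-unwind pattern. This is a sketch under assumptions, not the patch that was eventually sent: the sketch_* functions, the enabled[] array and the fail_domain test hook are hypothetical, and disabling a whole domain in one call stands in for the per-OPP opp_disable() loop in the option 2 pseudocode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum sketch_domain { SK_MPU, SK_L3, SK_DSP, SK_NR };

static bool enabled[SK_NR];	/* which domains currently have usable OPPs */
static int fail_domain = -1;	/* test hook: domain whose registration fails */

/* Stub for opp_init_list(): returns 0 on success, -1 on failure. */
static int sketch_init_list(enum sketch_domain d)
{
	if ((int)d == fail_domain)
		return -1;
	enabled[d] = true;
	return 0;
}

/* Stub for walking a domain's table and disabling every OPP in it. */
static void sketch_disable_all(enum sketch_domain d)
{
	enabled[d] = false;
}

/* Either all three domains end up registered, or none stays enabled. */
static int sketch_init_opp_table(void)
{
	if (sketch_init_list(SK_MPU))
		goto fail;
	if (sketch_init_list(SK_L3))
		goto mpu_disable;
	if (sketch_init_list(SK_DSP))
		goto l3_disable;
	return 0;

l3_disable:
	sketch_disable_all(SK_L3);
mpu_disable:
	sketch_disable_all(SK_MPU);
fail:
	fprintf(stderr, "WARNING: registration of OPP tables failed\n");
	return -1;
}
```

The labels are ordered so that a later failure falls through the earlier clean-up steps, which keeps the unwind path in one place.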


[PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-18 Thread Nishanth Menon
BUG_ON should ideally not contain functional code. Move it out.

Ref: http://marc.info/?l=linux-kernel&m=109391212925546&w=2

Cc: Ambresh K ambr...@ti.com
Cc: Benoit Cousson b-cous...@ti.com
Cc: Eduardo Valentin eduardo.valen...@nokia.com
Cc: Kevin Hilman khil...@deeprootsystems.com
Cc: Phil Carmody ext-phil.2.carm...@nokia.com
Cc: Sanjeev Premi pr...@ti.com
Cc: Tero Kristo tero.kri...@nokia.com
Cc: Thara Gopinath th...@ti.com

Signed-off-by: Nishanth Menon n...@ti.com
---
 arch/arm/mach-omap2/cpufreq34xx.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-omap2/cpufreq34xx.c b/arch/arm/mach-omap2/cpufreq34xx.c
index c453ec5..f0ed3ae 100644
--- a/arch/arm/mach-omap2/cpufreq34xx.c
+++ b/arch/arm/mach-omap2/cpufreq34xx.c
@@ -111,6 +111,7 @@ static struct omap_opp_def __initdata omap36xx_dsp_rate_table[] = {
 
 void __init omap3_pm_init_opp_table(void)
 {
+	int r;
 	struct omap_opp_def **omap3_opp_def_list;
 	struct omap_opp_def *omap34xx_opp_def_list[] = {
 		omap34xx_mpu_rate_table,
@@ -126,8 +127,9 @@ void __init omap3_pm_init_opp_table(void)
 	omap3_opp_def_list = cpu_is_omap3630() ? omap36xx_opp_def_list :
 			omap34xx_opp_def_list;
 
-	BUG_ON(opp_init_list(OPP_MPU, omap3_opp_def_list[0]));
-	BUG_ON(opp_init_list(OPP_L3, omap3_opp_def_list[1]));
-	BUG_ON(opp_init_list(OPP_DSP, omap3_opp_def_list[2]));
+	r = opp_init_list(OPP_MPU, omap3_opp_def_list[0]);
+	r |= opp_init_list(OPP_L3, omap3_opp_def_list[1]);
+	r |= opp_init_list(OPP_DSP, omap3_opp_def_list[2]);
+	BUG_ON(r);
 }
 
-- 
1.6.3.3
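
On the commit message's point that functional code should not live inside BUG_ON(): assertion-style macros are often written so that they can be configured to evaluate nothing, and any call placed inside such a macro then silently disappears. The user-space sketch below illustrates that hazard only. CHECK_ENABLED, CHECK_DISABLED, register_table() and the init_* helpers are hypothetical names invented for this example, and the sketch makes no claim about how a particular kernel configuration defines BUG_ON().

```c
#include <assert.h>

static int registered;	/* how many times the stub registration ran */
static int bug_hits;	/* how many times a check fired */

/* Stub standing in for opp_init_list(); returns 0 on success. */
static int register_table(void)
{
	registered++;
	return 0;
}

/* A check that evaluates its argument. */
#define CHECK_ENABLED(cond)	do { if (cond) bug_hits++; } while (0)
/* The same check as it might look when configured out entirely. */
#define CHECK_DISABLED(cond)	do { } while (0)

/* Fragile: the registration call lives inside the check. */
static void init_fragile_enabled(void)
{
	CHECK_ENABLED(register_table());
}

static void init_fragile_disabled(void)
{
	CHECK_DISABLED(register_table());	/* call vanishes with the macro */
}

/* Robust: do the work first, then check the stored result. */
static void init_robust_disabled(void)
{
	int r = register_table();

	CHECK_DISABLED(r);
	(void)r;	/* silence the unused-value warning in this sketch */
}
```

Only the robust variant still registers the table when the check is compiled out, which is the behaviour the patch secures by assigning to r before calling BUG_ON(r).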



Re: [PM-WIP-OPP][PATCH 1/4] omap3: pm: cpufreq: BUG_ON cleanup

2010-03-18 Thread Kevin Hilman
Nishanth Menon n...@ti.com writes:

 BUG_ON should ideally not contain functional code. Move it out.

True.  But this code should not be using BUG_ON() in the first place.

We should not crash the whole kernel in this case, just fail
with a warning.

If you're cleaning this up, can you make it fail more gracefully.

Kevin


 Ref: http://marc.info/?l=linux-kernel&m=109391212925546&w=2

 Cc: Ambresh K ambr...@ti.com
 Cc: Benoit Cousson b-cous...@ti.com
 Cc: Eduardo Valentin eduardo.valen...@nokia.com
 Cc: Kevin Hilman khil...@deeprootsystems.com
 Cc: Phil Carmody ext-phil.2.carm...@nokia.com
 Cc: Sanjeev Premi pr...@ti.com
 Cc: Tero Kristo tero.kri...@nokia.com
 Cc: Thara Gopinath th...@ti.com

 Signed-off-by: Nishanth Menon n...@ti.com
 ---
  arch/arm/mach-omap2/cpufreq34xx.c |8 +---
  1 files changed, 5 insertions(+), 3 deletions(-)

 diff --git a/arch/arm/mach-omap2/cpufreq34xx.c 
 b/arch/arm/mach-omap2/cpufreq34xx.c
 index c453ec5..f0ed3ae 100644
 --- a/arch/arm/mach-omap2/cpufreq34xx.c
 +++ b/arch/arm/mach-omap2/cpufreq34xx.c
 @@ -111,6 +111,7 @@ static struct omap_opp_def __initdata 
 omap36xx_dsp_rate_table[] = {
  
  void __init omap3_pm_init_opp_table(void)
  {
 + int r;
   struct omap_opp_def **omap3_opp_def_list;
   struct omap_opp_def *omap34xx_opp_def_list[] = {
   omap34xx_mpu_rate_table,
 @@ -126,8 +127,9 @@ void __init omap3_pm_init_opp_table(void)
   omap3_opp_def_list = cpu_is_omap3630() ? omap36xx_opp_def_list :
   omap34xx_opp_def_list;
  
 - BUG_ON(opp_init_list(OPP_MPU, omap3_opp_def_list[0]));
 - BUG_ON(opp_init_list(OPP_L3, omap3_opp_def_list[1]));
 - BUG_ON(opp_init_list(OPP_DSP, omap3_opp_def_list[2]));
 + r = opp_init_list(OPP_MPU, omap3_opp_def_list[0]);
 + r |= opp_init_list(OPP_L3, omap3_opp_def_list[1]);
 + r |= opp_init_list(OPP_DSP, omap3_opp_def_list[2]);
 + BUG_ON(r);
  }
  
 -- 
 1.6.3.3