Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-08 Thread Valdis . Kletnieks
On Mon, 02 Apr 2007 10:35:40 +0200, Rene Rebe said:

(Sorry for the late reply..)

> IIRC the BIOS of an MSI Megabook S270 (which I formerly owned) reports this
> "Critical temperature reached (128C)" when the battery runs empty and
> the OS has taken no action on the low-battery indications. I guess
> the BIOS people thought this was a good last resort to make the OS
> really shut down before the box just turns off.

It's not just MSI - I recently managed to put a Dell Latitude D820 into its bag
while still running, where it babbled to itself running on the warm side for
several hours.  When I finally did get it out, it *was* quite hot to the touch,
but I was amazed that it managed to run the battery down to somewhere under 4%
(which took some 4 or 5 hours) and then throw the thermal check that made it
shut down - quite the coincidence indeed.

However, "ran warm but tolerable and then used the thermal to shut down when
the battery failed" matches the symptoms much better.





Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-03 Thread Henrique de Moraes Holschuh
On Tue, 03 Apr 2007, Jeremy Fitzhardinge wrote:
> Attached.  Is there some tool for decoding the DSDT?

iasl.  The documentation is the ACPI Specification.
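
For anyone following along, a minimal sketch of that step. It assumes a 2.6-era kernel that exposes the table at /proc/acpi/dsdt and an `iasl` binary on the PATH; the function names and the `dsdt.dat` filename are made up for illustration, and `-d` is iasl's disassemble option.

```python
import subprocess


def disassemble_cmd(table_path):
    """Build the iasl command line that disassembles an AML table."""
    return ["iasl", "-d", table_path]


def dump_and_disassemble(src="/proc/acpi/dsdt", dst="dsdt.dat"):
    """Copy the raw DSDT out of procfs, then run iasl on the copy.

    Both paths are assumptions; reading the table usually needs root.
    """
    with open(src, "rb") as table, open(dst, "wb") as out:
        out.write(table.read())
    # On success iasl writes its decoded source file next to dst.
    subprocess.run(disassemble_cmd(dst), check=True)
```

Calling `dump_and_disassemble()` as root would leave the decoded table beside `dsdt.dat`, ready to be read against the ACPI Specification.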

> >> ezr:pts/1; cat /proc/acpi/ibm/thermal
> >> temperatures:   72 55 -128 65 40 -128 35 -128 51 53 -128 -128 -128 -128
> >> -128 -128
> >
> > This is a highly unusual output for thinkpads, but might be the expected one
> > for your X60, the X-series has always been a bit weird.  I'd highly suggest
> 
> How would you expect it to look?  I did some non-conclusive tests under

I would not expect a -128 on the third position.  The other two -128 are
expected, as they are the thermal sensors for the secondary battery.  There
is also one less sensor than I'd expect.
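
A rough sketch of how that output could be split into live readings and -128 placeholders. It assumes only the `temperatures:` format quoted above (including the mail-wrapped second line); `parse_thermal` is a hypothetical helper, and nothing here says which physical sensor each position belongs to.

```python
def parse_thermal(text):
    """Return one entry per sensor slot: degrees C, or None for -128."""
    # The quoted output is wrapped by the mailer, so treat all whitespace alike.
    fields = text.replace("temperatures:", " ").split()
    return [None if int(f) == -128 else int(f) for f in fields]


sample = (
    "temperatures:   72 55 -128 65 40 -128 35 -128 51 53 -128 -128 -128 -128\n"
    "-128 -128"
)
readings = parse_thermal(sample)

# Slots that currently report a temperature, keyed by zero-based position.
present = {i: t for i, t in enumerate(readings) if t is not None}
# Slots that report the -128 placeholder.
absent = [i for i, t in enumerate(readings) if t is None]
```

Counting from zero, the third position discussed above is slot 2, which lands in `absent` for this sample.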

> Windows, and I'm beginning to get the feeling that there is actually a
> cooling problem with the hardware.

This must be at least the third complaint I have come across of an X60 that
boils the CPU.  The standard fix from Lenovo is a planar card swap (motherboard
swap).  Since this *does* mean they replace the thermal compound and fully
reassemble the heat pipes, it may be that just fixing the thermal coupling
between the cooling assembly and the CPU would do it.

> It doesn't seem to help.  When it's failing to control cooling (temp
> creeps towards 100C while under load), it's going at ~3700RPM, which is
> about what level 7 does.

Well, at least the EC is not misbehaving, then.

> What's a typical max RPM?  I'm getting the impression that there's

More than 4000rpm, in disengaged mode.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-03 Thread RusH

On 4/3/07, Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> Attached.  Is there some tool for decoding the DSDT?


iasl
http://www.intel.com/technology/iapc/acpi/downloads.htm
http://www.intel.com/technology/iapc/acpi/license2.htm

--
Who logs in to gdm? Not I, said the duck.


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-03 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Jeremy Fitzhardinge wrote:
> > You can use ibm-acpi to properly track your thinkpad thermal sensors, load
> > it with the "experimental=1" parameter, and look at what gets exported at
> > /proc/acpi/ibm/thermal.
> 
> Interesting.  The first number corresponds with the ACPI THM0
> temperature, but I can't see anything corresponding to THM1.  Is there
> something that documents what all the temperatures are measuring in an
> X60?  Thinkwiki doesn't seem to have any info.

Well, send me the DSDT and dmidecode output (mask off the UUID and serial
numbers), and I will be able to say more.

> ezr:pts/1; cat /proc/acpi/ibm/thermal
> temperatures:   72 55 -128 65 40 -128 35 -128 51 53 -128 -128 -128 -128
> -128 -128

This is a highly unusual output for thinkpads, but might be the expected one
for your X60, the X-series has always been a bit weird.  I'd highly suggest
asking for X60 thermal data from other X60 owners on the linux-thinkpad ML.
Make sure to state your X60 model number, and to request that everyone does
the same.

> > You can also use /proc/acpi/ibm/fan to check the fan's state.  And use the
> 
> It's set to auto.  Presumably that means it's tied into the temperature
> sensors and will be able to keep the temp under control...

Yes, if all sensors are working fine.  That said, people override the EC fan
control all the time, because it seems not to be doing what people want.
Thinkwiki has more on this, and you want to set your fan to level 7 when
doing CPU-intensive work for now, since you are experiencing some sort of
trouble anyway...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-02 Thread Rene Rebe
On Sunday 01 April 2007 20:57:57 Kyle Moffett wrote:
> On Mar 31, 2007, at 02:36:08, Jeremy Fitzhardinge wrote:
> > When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches,  
> > I've been getting my machine shut down with critical thermal  
> > shutdown messages:
> >
> > Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
> > Mar 30 23:19:03 localhost kernel: Critical temperature reached (128  
> > C), shutting down.
> > Mar 30 23:19:03 localhost kernel: Critical temperature reached (128  
> > C), shutting down.
> > Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system  
> > halt
> >
> > and the machine does feel pretty hot.  Interestingly, when the  
> > machine reboots, the fan spins up to a noticeably higher speed, so  
> > it seems that maybe something is getting fan speed control wrong.
> 
> Well, 128C is more than hot enough to boil water and well above the  
> thermal tolerances of most CPUs, so I would imagine that were your  
> CPU actually that hot it wouldn't be capable of printing the  
> "Critical temperature reached" messages, let alone properly rebooting.

IIRC the BIOS of an MSI Megabook S270 (which I formerly owned) reports this
"Critical temperature reached (128C)" when the battery runs empty and
the OS has taken no action on the low-battery indications. I guess
the BIOS people thought this was a good last resort to make the OS
really shut down before the box just turns off.

Yours,

-- 
  René Rebe - ExactCODE GmbH - Europe, Germany, Berlin
  http://exactcode.de | http://t2-project.org | http://rene.rebe.name
  +49 (0)30 / 255 897 45


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Jeremy Fitzhardinge
Henrique de Moraes Holschuh wrote:
> On Sun, 01 Apr 2007, Jeremy Fitzhardinge wrote:
>   
>> control problems.  Perhaps the ambient temperature was lower when I
>> reported success.
>> 
>
> You can use ibm-acpi to properly track your thinkpad thermal sensors, load
> it with the "experimental=1" parameter, and look at what gets exported at
> /proc/acpi/ibm/thermal.
>   

Interesting.  The first number corresponds with the ACPI THM0
temperature, but I can't see anything corresponding to THM1.  Is there
something that documents what all the temperatures are measuring in an
X60?  Thinkwiki doesn't seem to have any info.

ezr:pts/1; cat /proc/acpi/ibm/thermal
temperatures:   72 55 -128 65 40 -128 35 -128 51 53 -128 -128 -128 -128
-128 -128

> You can also use /proc/acpi/ibm/fan to check the fan's state.  And use the
> "level 7" /proc/acpi/ibm/fan command to set the emergency cooling level, and
> "level disengaged" command to set the really badass fan cooling level (might
> damage your hardware, we don't know if it is safe and IBM/Lenovo isn't
> talking).
>   

It's set to auto.  Presumably that means it's tied into the temperature
sensors and will be able to keep the temp under control...

J



Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Jeremy Fitzhardinge wrote:
> control problems.  Perhaps the ambient temperature was lower when I
> reported success.

You can use ibm-acpi to properly track your thinkpad thermal sensors, load
it with the "experimental=1" parameter, and look at what gets exported at
/proc/acpi/ibm/thermal.

You can also use /proc/acpi/ibm/fan to check the fan's state.  And use the
"level 7" /proc/acpi/ibm/fan command to set the emergency cooling level, and
"level disengaged" command to set the really badass fan cooling level (might
damage your hardware, we don't know if it is safe and IBM/Lenovo isn't
talking).
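
A small sketch of how those fan commands could be issued from a script. It assumes ibm-acpi accepts writes to /proc/acpi/ibm/fan and that the caller is root. The helper names are hypothetical. "auto", "7" and "disengaged" are the levels named in this thread; the rest of the 0-7 range is an assumption about the driver.

```python
FAN_PATH = "/proc/acpi/ibm/fan"

# "auto", "7" and "disengaged" come from this thread; 0-6 are assumed.
ALLOWED_LEVELS = {"auto", "disengaged"} | {str(n) for n in range(8)}


def fan_command(level):
    """Return the text to write to the fan file for the given level."""
    level = str(level)
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported fan level: {level!r}")
    return f"level {level}\n"


def set_fan_level(level, path=FAN_PATH):
    """Write the command to the fan file (needs root and a loaded module)."""
    with open(path, "w") as fan:
        fan.write(fan_command(level))
```

`set_fan_level(7)` would request the emergency cooling level and `set_fan_level("auto")` would hand control back to the EC; the caveat above about "disengaged" applies unchanged.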

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Jeremy Fitzhardinge
Jeremy Fitzhardinge wrote:
> Alexey Starikovskiy wrote:
>   
>> Could you try to unload or disable hardware sensors and check if it
>> helps?
>> CONFIG_I2C=m
>> CONFIG_I2C_ALGOBIT=m
>> CONFIG_I2C_ALGOPCA=m
>> CONFIG_I2C_I810=m
>> CONFIG_I2C_PIIX4=m
>> CONFIG_SENSORS_DS1337=m
>> CONFIG_SENSORS_DS1374=m
>> CONFIG_SENSORS_EEPROM=m
>> CONFIG_SENSORS_PCF8574=m
>> CONFIG_SENSORS_PCA9539=m
>> CONFIG_SENSORS_PCF8591=m
>> CONFIG_SENSORS_MAX6875=m
>> 
>
> That seems to have helped.  If I watch
> /proc/acpi/thermal_zone/THM?/temperature, it seems stable even under
> load.   I didn't try watching the thermal_zones when these options were
> enabled, but I presume the temperature was not controlled for it to hit
> 128 degC.

Hm, perhaps I was too optimistic.  I have lm_sensors disabled, and all
i2c options unconfigured in my kernel, but it still has temperature
control problems.  Perhaps the ambient temperature was lower when I
reported success.

When I do a big compile, the temperature reported in
/proc/acpi/thermal_zone/THM0/temperature rapidly approaches 100C, and
when it goes over 100 it triggers the critical shutdown.  When it shuts
down, it (mis-?)reports the temperature as 128C.
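
A rough way to log that climb while a compile is running, as a sketch only. It assumes the thermal zone files contain a line of the form `temperature:   NN C`; the function names and the 95 C warning threshold are illustrative and not taken from any tool mentioned here.

```python
import glob
import re
import time

TEMP_RE = re.compile(r"temperature:\s*(-?\d+)\s*C")


def parse_temperature(text):
    """Extract the integer Celsius value, or None if the line is not recognised."""
    match = TEMP_RE.search(text)
    return int(match.group(1)) if match else None


def read_zones(pattern="/proc/acpi/thermal_zone/THM?/temperature"):
    """Read every matching thermal zone file once."""
    readings = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as zone:
            readings[path] = parse_temperature(zone.read())
    return readings


def watch(interval=2.0, warn_at=95):
    """Print all zones every `interval` seconds until interrupted."""
    while True:
        stamp = time.strftime("%H:%M:%S")
        for path, temp in read_zones().items():
            hot = temp is not None and temp >= warn_at
            print(f"{stamp} {path}: {temp} C{'  <-- near critical' if hot else ''}")
        time.sleep(interval)
```

Running `watch()` in a second terminal during the build gives a timestamped trace that can be compared against the moment the critical shutdown fires.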

This seems to be real, and not a kernel artifact.  If I reboot the same
kernel immediately, it boots up to the message "ACPI: Core revision
20070126" and then hangs.  If I boot Windows immediately afterwards, it
reboots a short way into the boot process.

I've noticed one behavioral change with this kernel.  On the older
kernels, the CPU frequency would sometimes drop to lowest speed,
apparently because of an ACPI thermal limiting event.  This kernel
doesn't seem to drop speed.  I seem to remember Ingo had a patch to
ignore the ACPI thermal limits in cpufreq; did that get merged?

J


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Pavel Machek wrote:
> ACPI is misdesigned, and lm_sensors can't cope with that.

Err, HOW exactly are you accessing the ThinkPad i2c buses directly? Or did
Lenovo completely change the hardware design of ThinkPads with the X60?

Or did someone add an lm-sensors driver that attaches to the ACPI EC ports now?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Pavel Machek wrote:
> Are you running lm_sensors?

lm-sensors can't confuse any recent thinkpad's thermal management.  The i2c
buses that matter are all behind the EC, you have to ask the EC for data.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Jeremy Fitzhardinge
Kyle Moffett wrote:
> Well, 128C is more than hot enough to boil water and well above the
> thermal tolerances of most CPUs, so I would imagine that were your CPU
> actually that hot it wouldn't be capable of printing the "Critical
> temperature reached" messages, let alone properly rebooting.

Yes, it's probably a bad reading, but it's not completely absurd - chips can
operate up to ~100C, but they're definitely unhappy at that point.  In
fact, I typically get 85-95 degrees from those sensors in normal
operation, but I have no idea whether that's a real measurement or not.

J


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Kyle Moffett

On Mar 31, 2007, at 02:36:08, Jeremy Fitzhardinge wrote:
> When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches,
> I've been getting my machine shut down with critical thermal
> shutdown messages:
>
> Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
> Mar 30 23:19:03 localhost kernel: Critical temperature reached (128
> C), shutting down.
> Mar 30 23:19:03 localhost kernel: Critical temperature reached (128
> C), shutting down.
> Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system
> halt
>
> and the machine does feel pretty hot.  Interestingly, when the
> machine reboots, the fan spins up to a noticeably higher speed, so
> it seems that maybe something is getting fan speed control wrong.


Well, 128C is more than hot enough to boil water and well above the  
thermal tolerances of most CPUs, so I would imagine that were your  
CPU actually that hot it wouldn't be capable of printing the  
"Critical temperature reached" messages, let alone properly rebooting.


Cheers,
Kyle Moffett



Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Pavel Machek
Hi!

> > CONFIG_I2C=m
> > CONFIG_I2C_ALGOBIT=m
> > CONFIG_I2C_ALGOPCA=m
> > CONFIG_I2C_I810=m
> > CONFIG_I2C_PIIX4=m
> > CONFIG_SENSORS_DS1337=m
> > CONFIG_SENSORS_DS1374=m
> > CONFIG_SENSORS_EEPROM=m
> > CONFIG_SENSORS_PCF8574=m
> > CONFIG_SENSORS_PCA9539=m
> > CONFIG_SENSORS_PCF8591=m
> > CONFIG_SENSORS_MAX6875=m
> 
> That seems to have helped.  If I watch
> /proc/acpi/thermal_zone/THM?/temperature, it seems stable even under
> load.   I didn't try watching the thermal_zones when these options were
> enabled, but I presume the temperature was not controlled for it to hit
> 128 degC.
> 
> What's going on here?  Does reading an i2c sensor from the kernel
> prevent something else from doing it?

ACPI is misdesigned, and lm_sensors can't cope with that.

One idea was to add 'big acpi lock' and make lm_sensors take it, too.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Pavel Machek
Hi!

> > When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches, I've
> > been getting my machine shut down with critical thermal shutdown messages:
> 
> Hmm, don't think there's anything either in x86 that would touch this code.
> But can you double check with plain rc5? 
> 
> > Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
> > Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), 
> > shutting down.
> > Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), 
> > shutting down.
> > Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system halt
> > 
> > and the machine does feel pretty hot.
> 
> Pavel has been complaining about higher power consumption on his laptop versus
> .20 too.

Yep, sometimes it takes 30W instead of 12W... Anyway, this seems to
be measurement error. Notice how acpi claims 128C. I do not think cpu
can work at 128C and hardware should kill us before cpu is that hot.
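
One piece of arithmetic that may be relevant to the measurement-error idea, offered as an observation and not a diagnosis: a single byte with only the top bit set (0x80) reads as 128 when treated as unsigned and as -128 when treated as signed. Whether the 128 C in the log comes from such a byte is not established anywhere in this thread; the snippet only shows the arithmetic.

```python
import struct

raw = bytes([0x80])  # one byte with only the top bit set

as_unsigned = struct.unpack("B", raw)[0]  # interpret as unsigned 8-bit
as_signed = struct.unpack("b", raw)[0]    # interpret as signed 8-bit

print(as_unsigned, as_signed)
```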

Are you running lm_sensors?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Matthew Garrett
On Sat, Mar 31, 2007 at 11:28:46PM -0700, Jeremy Fitzhardinge wrote:

> That seems to have helped.  If I watch
> /proc/acpi/thermal_zone/THM?/temperature, it seems stable even under
> load.   I didn't try watching the thermal_zones when these options were
> enabled, but I presume the temperature was not controlled for it to hit
> 128 degC.
> 
> What's going on here?  Does reading an i2c sensor from the kernel
> prevent something else from doing it?

The i2c drivers access the same hardware as the ACPI methods, and 
there's no locking.
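
A toy model of that kind of collision, assuming a device that is read in two steps (select a register, then read the data port). It is not how the real SMBus controller or EC works; the class, the register numbers and the values are invented for illustration. The interleaving is written out by hand so the result is deterministic.

```python
import threading


class IndexedDevice:
    """Toy device: select a register index, then read the shared data port."""

    def __init__(self, registers):
        self.registers = registers
        self.index = 0

    def select(self, index):
        self.index = index

    def read_data(self):
        return self.registers[self.index]


dev = IndexedDevice({0x10: 55, 0x20: 72})

# No locking: a second client's select lands between the first client's steps.
dev.select(0x10)                   # client A (say, an ACPI method) wants 0x10
dev.select(0x20)                   # client B (say, a sensor driver) barges in
unlocked_result = dev.read_data()  # client A now reads register 0x20's value

# With one shared lock, select + read becomes a single indivisible step.
bus_lock = threading.Lock()


def locked_read(device, index):
    with bus_lock:
        device.select(index)
        return device.read_data()


locked_result = locked_read(dev, 0x10)
```

In the unlocked sequence client A silently gets the wrong register's value. A lock only helps if every party that touches the hardware takes it.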

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Kyle Moffett

On Mar 31, 2007, at 02:36:08, Jeremy Fitzhardinge wrote:
> When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches,
> I've been getting my machine shut down with critical thermal
> shutdown messages:
>
> Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
> Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), shutting down.
> Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), shutting down.
> Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system halt
>
> and the machine does feel pretty hot.  Interestingly, when the
> machine reboots, the fan spins up to a noticeably higher speed, so
> it seems that maybe something is getting fan speed control wrong.

Well, 128C is more than hot enough to boil water and well above the
thermal tolerances of most CPUs, so I would imagine that were your
CPU actually that hot it wouldn't be capable of printing the
"Critical temperature reached" messages, let alone properly rebooting.


Cheers,
Kyle Moffett



Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Jeremy Fitzhardinge
Kyle Moffett wrote:
> Well, 128C is more than hot enough to boil water and well above the
> thermal tolerances of most CPUs, so I would imagine that were your CPU
> actually that hot it wouldn't be capable of printing the "Critical
> temperature reached" messages, let alone properly rebooting.

Yes, it's probably a bad reading, but it's not completely absurd - chips can
operate up to ~100C, but they're definitely unhappy at that point.  In
fact, I typically get 85-95 degrees from those sensors in normal
operation, but I have no idea whether that's a real measurement or not.

J


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Pavel Machek wrote:
> Are you running lm_sensors?

lm-sensors can't confuse any recent thinkpad's thermal management.  The i2c
buses that matter are all behind the EC, you have to ask the EC for data.

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Pavel Machek wrote:
> ACPI is misdesigned, and lm_sensors can't cope with that.

Err, HOW exactly are you accessing the ThinkPad i2c buses directly? Or did
Lenovo completely change the hardware design of thinkpads in the X60?

Or did anyone add an lm-sensors driver that attaches to the ACPI EC ports now?

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Jeremy Fitzhardinge
Jeremy Fitzhardinge wrote:
> Alexey Starikovskiy wrote:
> > Could you try to unload or disable hardware sensors and check if it
> > helps?
> > CONFIG_I2C=m
> > CONFIG_I2C_ALGOBIT=m
> > CONFIG_I2C_ALGOPCA=m
> > CONFIG_I2C_I810=m
> > CONFIG_I2C_PIIX4=m
> > CONFIG_SENSORS_DS1337=m
> > CONFIG_SENSORS_DS1374=m
> > CONFIG_SENSORS_EEPROM=m
> > CONFIG_SENSORS_PCF8574=m
> > CONFIG_SENSORS_PCA9539=m
> > CONFIG_SENSORS_PCF8591=m
> > CONFIG_SENSORS_MAX6875=m
>
> That seems to have helped.  If I watch
> /proc/acpi/thermal_zone/THM?/temperature, it seems stable even under
> load.   I didn't try watching the thermal_zones when these options were
> enabled, but I presume the temperature was not controlled for it to hit
> 128 degC.

Hm, perhaps I was too optimistic.  I have lm_sensors disabled, and all
i2c options unconfigured in my kernel, but it still has temperature
control problems.  Perhaps the ambient temperature was lower when I
reported success.

When I do a big compile, the temperature reported in
/proc/acpi/thermal_zone/THM0/temperature rapidly approaches 100C, and
when it goes over 100 it triggers the critical shutdown.  When it shuts
down, it (mis-?)reports the temperature as 128C.
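
A minimal sketch of watching those readings from a script, assuming the
"temperature: NN C" line format shown elsewhere in this thread. The function
names and the 100 C limit used in the example are illustrative only; they are
not taken from the kernel or from any tool mentioned here.

```python
import re
from pathlib import Path

# Line format as shown in this thread, e.g. "temperature: 69 C"
# (the real file may pad the value with extra spaces).
_TEMP_RE = re.compile(r"temperature:\s*(-?\d+)\s*C")


def parse_zone_temperature(text):
    """Return the reading in degrees C from a thermal_zone 'temperature' file."""
    match = _TEMP_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognised temperature line: {text!r}")
    return int(match.group(1))


def read_zones(base="/proc/acpi/thermal_zone"):
    """Map each zone name (e.g. 'THM0') to its current reading.

    Returns an empty dict if the directory does not exist on this system.
    """
    base_path = Path(base)
    if not base_path.is_dir():
        return {}
    readings = {}
    for path in sorted(base_path.glob("*/temperature")):
        readings[path.parent.name] = parse_zone_temperature(path.read_text())
    return readings


def zones_at_or_above(readings, limit_c):
    """Names of the zones whose reading is at or above limit_c."""
    return [name for name, temp in readings.items() if temp >= limit_c]
```

Calling read_zones() every second or two during a compile and logging the
result of zones_at_or_above(readings, 100) would show how quickly THM0 climbs
before the shutdown.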

This seems to be real, and not a kernel artifact.  If I reboot the same
kernel immediately, it boots up to the message "ACPI: Core revision
20070126" and then hangs.  If I boot Windows immediately afterwards, it
reboots a short way into the boot process.

I've noticed one behavioral change with this kernel.  On the older
kernels, the CPU frequency would sometimes drop to lowest speed,
apparently because of an ACPI thermal limiting event.  This kernel
doesn't seem to drop speed.  I seem to remember Ingo had a patch to
ignore the ACPI thermal limits in cpufreq; did that get merged?

J


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Henrique de Moraes Holschuh
On Sun, 01 Apr 2007, Jeremy Fitzhardinge wrote:
> control problems.  Perhaps the ambient temperature was lower when I
> reported success.

You can use ibm-acpi to properly track your thinkpad thermal sensors: load
it with the experimental=1 parameter, and look at what gets exported at
/proc/acpi/ibm/thermal.

You can also use /proc/acpi/ibm/fan to check the fan's state.  And use the
"level 7" /proc/acpi/ibm/fan command to set the emergency cooling level, and
the "level disengaged" command to set the really badass fan cooling level
(might damage your hardware, we don't know if it is safe and IBM/Lenovo isn't
talking).
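
A rough sketch of issuing those fan commands from Python. Only "level 7",
"level disengaged" and the "auto" setting are mentioned in this thread; the
0-6 levels and the helper names below are assumptions, and writing to the
file will normally need root and a driver that has fan control enabled.

```python
from pathlib import Path

FAN_PATH = "/proc/acpi/ibm/fan"  # path given in this thread


def fan_command(level):
    """Build the command string for a fan level.

    Accepts an integer 0-7 (assumed range; only 7 appears in this thread),
    or the strings 'auto' and 'disengaged'.
    """
    if isinstance(level, int) and not isinstance(level, bool) and 0 <= level <= 7:
        return f"level {level}"
    if level in ("auto", "disengaged"):
        return f"level {level}"
    raise ValueError(f"unsupported fan level: {level!r}")


def read_fan_state(path=FAN_PATH):
    """Return the raw text the driver exports for the fan."""
    return Path(path).read_text()


def set_fan_level(level, path=FAN_PATH):
    """Write a 'level ...' command to the fan control file."""
    Path(path).write_text(fan_command(level) + "\n")
```

For example, set_fan_level(7) requests the emergency cooling level and
set_fan_level("auto") hands control back to the firmware. The same warning
applies to "disengaged" here as above.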

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-04-01 Thread Jeremy Fitzhardinge
Henrique de Moraes Holschuh wrote:
> On Sun, 01 Apr 2007, Jeremy Fitzhardinge wrote:
> > control problems.  Perhaps the ambient temperature was lower when I
> > reported success.
>
> You can use ibm-acpi to properly track your thinkpad thermal sensors, load
> it with the experimental=1 parameter, and look at what gets exported at
> /proc/acpi/ibm/thermal.

Interesting.  The first number corresponds with the ACPI THM0
temperature, but I can't see anything corresponding to THM1.  Is there
something that documents what all the temperatures are measuring in an
X60?  Thinkwiki doesn't seem to have any info.

ezr:pts/1; cat /proc/acpi/ibm/thermal
temperatures:   72 55 -128 65 40 -128 35 -128 51 53 -128 -128 -128 -128 -128 -128
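
A small sketch for picking the live sensors out of that output. It assumes
that -128 marks a slot with no sensor attached, which is how I read the
ibm-acpi documentation; the function names are made up for illustration.

```python
SENSOR_ABSENT = -128  # assumed "no sensor in this slot" marker


def parse_ibm_thermal(text):
    """Return every value after 'temperatures:' as a list of ints.

    Whitespace of any kind separates values, so output that a mail client
    wrapped onto a second line still parses.
    """
    _, found, values = text.partition("temperatures:")
    if not found:
        raise ValueError("no 'temperatures:' field in input")
    return [int(value) for value in values.split()]


def present_sensors(temps):
    """Map slot index to reading, skipping slots that report SENSOR_ABSENT."""
    return {index: temp for index, temp in enumerate(temps) if temp != SENSOR_ABSENT}


sample = ("temperatures:   72 55 -128 65 40 -128 35 -128 51 53 "
          "-128 -128 -128 -128\n-128 -128")
live = present_sensors(parse_ibm_thermal(sample))
print(live)
```

With the output above this leaves seven populated slots out of sixteen, with
slot 0 (the one matching THM0) the hottest at 72.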

> You can also use /proc/acpi/ibm/fan to check the fan's state.  And use the
> "level 7" /proc/acpi/ibm/fan command to set the emergency cooling level, and
> "level disengaged" command to set the really badass fan cooling level (might
> damage your hardware, we don't know if it is safe and IBM/Lenovo isn't
> talking).

It's set to auto.  Presumably that means it's tied into the temperature
sensors and will be able to keep the temp under control...

J



Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-03-31 Thread Jeremy Fitzhardinge
Alexey Starikovskiy wrote:
> Could you try to unload or disable hardware sensors and check if it
> helps?
> CONFIG_I2C=m
> CONFIG_I2C_ALGOBIT=m
> CONFIG_I2C_ALGOPCA=m
> CONFIG_I2C_I810=m
> CONFIG_I2C_PIIX4=m
> CONFIG_SENSORS_DS1337=m
> CONFIG_SENSORS_DS1374=m
> CONFIG_SENSORS_EEPROM=m
> CONFIG_SENSORS_PCF8574=m
> CONFIG_SENSORS_PCA9539=m
> CONFIG_SENSORS_PCF8591=m
> CONFIG_SENSORS_MAX6875=m

That seems to have helped.  If I watch
/proc/acpi/thermal_zone/THM?/temperature, it seems stable even under
load.   I didn't try watching the thermal_zones when these options were
enabled, but I presume the temperature was not controlled for it to hit
128 degC.

What's going on here?  Does reading an i2c sensor from the kernel
prevent something else from doing it?

J


Re: 2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-03-31 Thread Andi Kleen
On Saturday 31 March 2007 08:36, Jeremy Fitzhardinge wrote:
> When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches, I've
> been getting my machine shut down with critical thermal shutdown messages:

Hmm, I don't think there's anything in x86 either that would touch this code.
But can you double check with plain rc5?

> Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
> Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), 
> shutting down.
> Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), 
> shutting down.
> Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system halt
> 
> and the machine does feel pretty hot.

Pavel has been complaining about higher power consumption on his laptop versus
.20 too.

-Andi


2.6.21-rc5: Thinkpad X60 gets critical thermal shutdowns

2007-03-30 Thread Jeremy Fitzhardinge
When I run 2.6.21-rc5 + Andi's x86 patches + paravirt_ops patches, I've
been getting my machine shut down with critical thermal shutdown messages:

Mar 30 23:19:03 localhost kernel: ACPI: Critical trip point
Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), shutting down.
Mar 30 23:19:03 localhost kernel: Critical temperature reached (128 C), shutting down.
Mar 30 23:19:03 localhost shutdown[19417]: shutting down for system halt

and the machine does feel pretty hot.  Interestingly, when the machine
reboots, the fan spins up to a noticeably higher speed, so it seems that
maybe something is getting fan speed control wrong.

The machine is a Thinkpad X60, with a 1.8GHz Core Duo.  I can run it
indefinitely with the FC6 2.6.20-1.2933.fc6 kernel, so I don't think
there's anything wrong with the hardware.  And it was sitting on a
desktop plugged into mains, so there are no problems with obstructed airflow.

I was running a normal email/browsing/editing/compiling workload, and I
don't think there was anything particularly CPU intensive running at the
time.  I run cpufreq with the conservative governor.

Running now with the FC6 kernel, I get:
: ezr:pts/2; cat /proc/acpi/thermal_zone/THM?/temperature
temperature: 69 C
temperature: 82 C


Config attached.

J
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION="-paravirt"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_CFQ=y
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_SMP=y
CONFIG_X86_PC=y
CONFIG_PARAVIRT=y
CONFIG_VMI=y
CONFIG_MPENTIUMM=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_HPET_TIMER=y
CONFIG_NR_CPUS=8
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
CONFIG_X86_CPUID=m
CONFIG_EDD=m
CONFIG_HIGHMEM64G=y
CONFIG_PAGE_OFFSET=0xC000
CONFIG_HIGHMEM=y
CONFIG_X86_PAE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_RESOURCES_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_HIGHPTE=y
CONFIG_MATH_EMULATION=y
CONFIG_MTRR=y
CONFIG_IRQBALANCE=y
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_PHYSICAL_START=0x10
CONFIG_PHYSICAL_ALIGN=0x10
CONFIG_HOTPLUG_CPU=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION=""
CONFIG_SUSPEND_SMP=y
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_PROCFS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_IBM=m
CONFIG_ACPI_IBM_BAY=y
CONFIG_ACPI_BLACKLIST_YEAR=1999
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_SPEEDSTEP_CENTRINO=y
CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI=y
CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y

CONFIG_PCI=y