http://bugzilla.kernel.org/show_bug.cgi?id=13573

           Summary: ACPI: Unable to turn cooling device 'on'
                    (Quadcore-AMD64, Ubuntu64)
           Product: ACPI
           Version: 2.5
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: BIOS
        AssignedTo: acpi_b...@kernel-bugs.osdl.org
        ReportedBy: alois.schlo...@tugraz.at
                CC: tr...@suse.de, yakui.z...@intel.com
        Regression: No


About 1 hour after I started a heavy computing batch (using Matlab 7.6,
distributing the load across all 4 cores), the machine suddenly shut down.
kern.log shows these messages:

Jun 18 14:32:48 bcipc038 kernel: [1117042.573713] ACPI Exception
(thermal-0479): AE_ERROR, ACPI thermal trip point state changed
Jun 18 14:32:50 bcipc038 kernel: [1117042.573717] Please send acpidump to
linux-a...@vger.kernel.org
Jun 18 14:32:50 bcipc038 kernel: [1117042.573719]  [20080926]
Jun 18 14:32:50 bcipc038 kernel: [1117042.574046] ACPI: Critical trip point
Jun 18 14:32:50 bcipc038 kernel: [1117042.574072] Critical temperature reached
(72 C), shutting down.
Jun 18 14:32:50 bcipc038 kernel: [1117042.574098] ACPI: Unable to turn cooling
device [ffff88012f815a60] 'on'
Jun 18 14:32:58 bcipc038 kernel: [1117048.576698] Critical temperature reached
(58 C), shutting down.
Jun 18 14:32:58 bcipc038 kernel: [1117049.920186] [drm] Resetting GPU
Jun 18 14:32:58 bcipc038 kernel: [1117050.517645] mtrr: MTRR 5 not used
Jun 18 14:57:23 bcipc038 kernel: Inspecting /boot/System.map-2.6.28-11-generic

The same behavior was observed on the same machine a few months ago and was
reported here

  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/314001

and here 

   http://marc.info/?l=linux-acpi&m=123120299000668&w=1


After that, either the problem went away or I simply did not have time to try
to reproduce it.


Based on the previous feedback in 
http://marc.info/?l=linux-acpi&m=123120299000668&w=2 , I attach 
  acpidump.dump  
  dmesg.dump  
  dmidecode.dump  
  kern.log.20090619181656

However, I have no idea how to try the boot option "acpi.power_nocheck=1".
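
For what it is worth, once the option has been added to the kernel command
line through the bootloader, whether it actually reached the kernel could be
confirmed by reading /proc/cmdline. A minimal Python sketch of that check
follows; the helper names has_boot_param and read_cmdline are hypothetical,
and it only inspects the command line, it does not set anything.

```python
from typing import Optional


def has_boot_param(cmdline: str, name: str, value: Optional[str] = None) -> bool:
    """Return True if `name` (optionally with `value`) is on the command line.

    `cmdline` is the text of /proc/cmdline: whitespace-separated tokens,
    each either a bare flag ("quiet") or a "key=value" pair.
    """
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key != name:
            continue
        # Match any value when none was requested, else require equality.
        if value is None or (sep and val == value):
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the affected machine, has_boot_param(read_cmdline(), "acpi.power_nocheck", "1")
would be expected to return True only after booting with that option.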


The problem actually happened twice (see kern.log):

Jun 18 14:32:48 bcipc038 kernel: [1117042.573713] ACPI Exception
(thermal-0479): AE_ERROR, ACPI thermal trip point state changed

Jun 18 16:22:19 bcipc038 kernel: [ 5402.772605] ACPI Exception (thermal-0479):
AE_ERROR, ACPI thermal trip point state changed

Therefore, it seems to be reproducible again. I also noticed this message (4
times) in kern.log:

Jun 18 16:32:46 bcipc038 kernel: [    4.263055] [Firmware Bug]: powernow-k8:
Your BIOS does not provide ACPI _PSS objects in a way that Linux understands.
Please report this to the Linux ACPI maintainers and complain to your BIOS
vendor.


Is this important? What can I do about it? I have no experience with
kernel/ACPI internals. Let me know if I can do anything to help track down
this issue.
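
In case it helps with triage, the counts mentioned above could be tallied from
kern.log with a short script. This is only a sketch: the labels and the
tally_events helper are hypothetical, the match strings are taken from the
messages quoted in this report, and it assumes each message sits on a single
line in the real kern.log (they are wrapped in this mail).

```python
from typing import Dict, Iterable

# Substrings taken from the kernel messages quoted in this report.
# The dictionary keys are illustrative labels, not kernel identifiers.
PATTERNS = {
    "thermal_exception": "ACPI Exception (thermal-",
    "critical_temperature": "Critical temperature reached",
    "cooling_device_failure": "Unable to turn cooling device",
    "powernow_k8_pss": "powernow-k8: Your BIOS does not provide ACPI _PSS objects",
}


def tally_events(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many lines contain each of the substrings in PATTERNS."""
    counts = {label: 0 for label in PATTERNS}
    for line in lines:
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts
```

It could be run as tally_events(open("kern.log")), with the file name adjusted
to the log being inspected.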

Alois
