Re: cpufreqd as standard install?

John Moser Sat, 03 Mar 2012 00:17:12 -0800

On 03/03/2012 12:13 AM, Phillip Susi wrote:

On 02/29/2012 04:40 PM, John Moser wrote:
At full load (encoding a video), it eventually reaches 80C and the
system shuts down.
It sounds like you have some broken hardware. The stock heatsink and fan are designed to keep the cpu from overheating under full load at the design frequency and voltage. You might want to verify that your motherboard is driving the cpu at the correct frequency and voltage.


Possibly.

The only other use case I can think of is when ambient temperature is hot. Remember server rooms use air conditioning; I did find that for a while my machine would quickly overheat if the room temperature was above 79F, and so kept the room at 75F. The heat sink was completely clogged with dust at the time, though, which is why I recently cleaned and inspected it and checked all the fan speed monitors and motherboard settings to make sure everything was running as appropriate.

In any case if the A/C goes down in a server room, it would be nice to have the system CPU frequency scaling kick in and take the clock speed down before the chip overheats. Modern servers--for example, the new revision of the Dell PowerEdge II and III as per 4 or 5 years ago--lean on their low-power capabilities, and modern data centers use a centralized DC converter and high voltage (220V) DC mains in the data center to reduce power waste because of the high cost of electricity. It's extremely likely that said servers would provide a low enough clock speed to not overheat without air conditioning, which is an emergency situation.

Of course, the side benefit of not overheating desktops with inadequate cooling or faulty motherboard behavior is simply a bonus. Still, I believe in fault tolerance.

I currently have cpufreqd configured to clock to 1.8GHz at 73C, and move
to the ondemand governor at 70C.
This need for manual configuring is a good reason why it is not a candidate for standard install.

I've attached a configuration that generically uses sensors (i.e. if the program 'sensors' gives useful output, this works). It's just one core though (a multi-core system reads the same temperature for them all, as it's per-CPU); you can easily automatically generate this.
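
As a rough illustration of what automatic generation could look like, here is a minimal Python sketch that prints one set of [Rule] sections per sensor. The temperature bands and profile names are copied from the attached configuration; the sensor names temp1 and temp2 and the helper names are assumptions made up for this example, and a real generator would have to discover which sensors exist on the machine.

```python
# Temperature bands (rule name, low C, high C, profile) taken from the
# attached configuration.
BANDS = [
    ("Normal", 0, 70, "Standard"),
    ("CPU Hot", 73, 76, "Hot"),
    ("CPU Too Hot", 76, 100, "Overheating"),
]


def make_rule(name, sensor, low, high, profile):
    """Render one cpufreqd [Rule] section for a single sensor range."""
    return "\n".join([
        "[Rule]",
        f"name={name}",
        f"sensor={sensor}:{low}-{high}",
        f"profile={profile}",
        "[/Rule]",
    ])


def generate_rules(sensors):
    """Render every band for every sensor name in the given list."""
    sections = []
    for sensor in sensors:
        for name, low, high, profile in BANDS:
            # Suffix the sensor name so rule names stay unique.
            sections.append(make_rule(f"{name} {sensor}", sensor, low, high, profile))
    return "\n\n".join(sections)


if __name__ == "__main__":
    # "temp1" and "temp2" are hypothetical sensor names.
    print(generate_rules(["temp1", "temp2"]))
```

The output is meant to be pasted below the [Profile] sections of the attached file; it has not been checked against a running cpufreqd.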

Mind you on the topic of automatic generation, 80C is a hard limit. It just is. My machine reports (through sensors) +95.0C as "Critical", but my BIOS shuts down the system at +80.0C immediately. Silicon physically does not tolerate temperatures above 80.0C well at all; if a chip claims it can run at 95.0C it's lying. Even SOD-CMOS doesn't tolerate those temperatures.

As well, again, you could write some generic profiles that detect when the system is running on battery (UPS, laptop) and make appreciable adjustments based on how much battery life is left.
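
A rough sketch of such battery rules, again generated from Python. It assumes the ac= and battery_interval= rule directives as I remember them from the cpufreqd.conf(5) manpage and the stock example configuration; check the manpage before relying on them. The percentage bands and the "Powersave" profile name are invented for illustration, and each profile named here would need a matching [Profile] section.

```python
# Hypothetical battery bands: (rule name, low %, high %, profile name).
BATTERY_BANDS = [
    ("Battery High", 70, 100, "Standard"),
    ("Battery Medium", 30, 70, "Hot"),
    ("Battery Low", 0, 30, "Powersave"),  # "Powersave" is a placeholder profile
]


def make_battery_rule(name, low, high, profile):
    """Render one [Rule] section that applies only while on battery."""
    return "\n".join([
        "[Rule]",
        f"name={name}",
        "ac=off",  # assumed directive: match only when AC power is absent
        f"battery_interval={low}-{high}",  # assumed directive: charge percentage range
        f"profile={profile}",
        "[/Rule]",
    ])


def generate_battery_rules():
    """Render all battery bands as one block of configuration text."""
    return "\n\n".join(make_battery_rule(*band) for band in BATTERY_BANDS)


if __name__ == "__main__":
    print(generate_battery_rules())
```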

At 73C, the system switches from 1.9GHz to 1.8GHz. Ten seconds later,
it's at 70C and switches back to 1.9GHz. 41 seconds after that, it
reaches 73C again and switches to 1.8GHz.

That means at stock frequency (1.9GHz) with stock cooling equipment, the
CPU overheats under full load. Clocked 0.1GHz slower than its rated
speed, it rapidly cools. Which is ridiculous; who designed this thing?

This sounds like your motherboard is overvolting the cpu in that 1.9GHz stepping.

Possibly, but the settings are all default, nothing set to overclock (it has jumper free overclocking configuration, but the option "Standard" is default for clock rate and voltage settings, which I assume the CPU supplies).

Basically the argument here is between "Supply fault tolerance" and "Well your motherboard is [old|poorly designed] so buy a new one." That's an excellent argument for hard drives (I have, in fact, suggested in the past that Ubuntu monitor hard disks for behavior indicative of dying drives--SMART errors, IDE RESET commands because the drive hangs, etc--and begin annoying the user with messages about the SEVERE risk of extreme data loss if he doesn't back up his data), but really if my mobo/CPU is aging and the CPU runs a little hot I'm not going to cry when the CPU suddenly burns out and my machine shuts down. I'll be confused, annoyed, but I'll buy a new one--I might buy an entire new computer, unaware that just my CPU is broken, and shove the hard drive in there. So there's no harm in allowing the user's hardware to go ahead and burn itself out if you think that's what's going on here.

By all means that doesn't mean you can't have a diagnostic center somewhere that the user can review and see the whole collection. "Ethernet: Lots of garbage [Possibly: Faulty switch, faulty NIC, another computer with a chattering NIC spewing packets]." "CPU: Overheats under high CPU load [Possibly: Dust-clogged CPU heat sink, failing CPU fan, overclocking, failing CPU, failing motherboard voltage regulators, buggy motherboard BIOS]." "/!\ Hard drive: Freezes and needs IDE Resets [Possibly: Dying hard drive/!\, dying IDE controller, dying RAID controller] /!\WARNING: SEVERE DATA LOSS POSSIBLE". Etc. Looks like you really need a new computer...

Yes I have strange ideas about what a computer should and shouldn't do. But then, you know, people run huge racks of computers that fail catastrophically if you don't pipe an air conditioning line straight to the chassis fan intake (take a look under the cabinet, the floor tile directly under each server rack is perforated--the raised floor has A/C pumped under it and it vents directly and exclusively into the server cabinets).

# this is a comment
# see CPUFREQD.CONF(5) manpage for a complete reference
#
# Note: ondemand/conservative Profiles are disabled because
#       they are not available on many platforms.

[General]
pidfile=/var/run/cpufreqd.pid
poll_interval=0.2
verbosity=4
#enable_remote=1
#remote_group=root
[/General]

[Profile]
name=Standard
minfreq=0%
maxfreq=100%
policy=ondemand
[/Profile]

[Profile]
name=Hot
minfreq=50%
maxfreq=95%
policy=ondemand
[/Profile]

[Profile]
name=Overheating
minfreq=0%
maxfreq=10%
policy=ondemand
[/Profile]

##
# Basic states
##
[Rule]
name=Normal
#acpi_temperature=0-70
sensor=temp1:0-70
#cpu_interval=00-100
profile=Standard
[/Rule]

##
# Special Rules
##
# CPU Too hot!
[Rule]
name=CPU Hot
#acpi_temperature=4-5
sensor=temp1:73-76
#cpu_interval=00-100
profile=Hot
[/Rule]

[Rule]
name=CPU Too Hot
#acpi_temperature=50-100
sensor=temp1:76-100
#cpu_interval=00-100
profile=Overheating
[/Rule]
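
The rules above leave a gap between 70 and 73, which is what gives the switch-down-at-73, switch-back-at-70 behavior described earlier. The Python sketch below replays a series of readings against the same ranges to show that effect. It rests on two assumptions that are mine, not documented cpufreqd behavior: that the current profile is kept when no rule matches, and that the hotter rule wins where two ranges touch (76). The function names are made up for the example.

```python
# (low C, high C, profile), mirroring the sensor= ranges in the rules above.
RULES = [
    (0, 70, "Standard"),
    (73, 76, "Hot"),
    (76, 100, "Overheating"),
]


def next_profile(temp_c, current):
    """Pick the profile for one reading, keeping the current one in a gap."""
    matches = [profile for low, high, profile in RULES if low <= temp_c <= high]
    # Assumption: where two ranges touch (76 C), prefer the later, hotter rule.
    return matches[-1] if matches else current


def replay(readings, start="Standard"):
    """Feed temperature readings through the rules and record each profile."""
    profile = start
    history = []
    for temp in readings:
        profile = next_profile(temp, profile)
        history.append(profile)
    return history


if __name__ == "__main__":
    # Warm up through the 70-73 gap, then cool back down through it.
    print(replay([68, 71, 72, 73, 72, 71, 70, 69]))
```

Under those assumptions the profile changes to Hot only at 73 and returns to Standard only at 70, so readings that hover at 71 or 72 do not cause flapping.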

-- 
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss
