Someone last week asked about processor thermals and semiconductor reliability. The below is from a paper I put together on the thermal/acoustics tradeoffs for Intel Developer Forum a few years ago. First, a little math (which I am hoping that everyone here will appreciate, and forgive the slightly off-topic post).

The Arrhenius model states that the long-term reliability of a semiconductor device degrades with increasing temperature, following the exponential function that describes the kinetics of chemical reactions, known as the Arrhenius equation.

              (  Ea )
             -(-----)
              ( k*T )
r(t) = r0 * e

Arrhenius Equation

The rate of the process (r(t)) is equal to an initial failure rate (r0) times e raised to the negative of the activation energy for a given process, Ea (shown in table 3), divided by the product of k (Boltzmann's constant, 8.6 x 10^-5 eV/K) and the temperature in Kelvin (T). A composite activation energy covering all of a specific device's failure modes can be determined from accelerated life testing on a large sample of devices and from long-term field failure statistics.
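As a rough illustration of how the equation is evaluated, here is a minimal Python sketch. The function name arrhenius_rate and the placeholder r0 = 1.0 are my own assumptions for illustration, not values from the paper, and the constant uses the slightly more precise 8.617 x 10^-5 eV/K rather than the rounded figure above.

```python
import math

# Boltzmann's constant in eV/K (the post rounds this to 8.6e-5).
BOLTZMANN_EV_PER_K = 8.617e-5


def arrhenius_rate(r0, ea_ev, temp_k):
    """Return r0 * exp(-Ea / (k * T)), with T in Kelvin and Ea in eV."""
    if temp_k <= 0:
        raise ValueError("temperature must be a positive value in Kelvin")
    return r0 * math.exp(-ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


if __name__ == "__main__":
    # r0 = 1.0 is a hypothetical placeholder, so only the ratios between
    # the printed values are meaningful, not the absolute numbers.
    for temp_c in (25, 50, 75):
        rate = arrhenius_rate(1.0, 0.9, temp_c + 273.15)
        print(f"{temp_c} C: relative rate {rate:.3e}")
```

Because r0 is normally unknown for a real device, the useful output is the ratio between two temperatures rather than any single value.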

A list of activation energies for different semiconductor failure modes

                            Approximate
Failure Mode                Activation  Energy (eV)
Contact electromigration    0.9
Contact metallurgy          0.9
Contamination               1.0 - 1.4
Diffusion                   0.9 - 1.5
DRAM charge loss            0.6
Electrolytic corrosion      0.3 - 0.6
Electromigration in Al      0.5 - 0.9
Gate oxide short            0.3
Metal Migration             0.9 - 1.8
Microcracks                 1.3
Oxide defects               0.3
Plastic chemistry           1.0
Polarization                1.0
Silicon defects             0.3 - 0.5
Surface charge              0.5 - 1.0
Wire electromigration       0.5

Most semiconductor manufacturers do not disclose the composite activation energy for a specific device, as it is highly predictive and can sometimes be misleading.

To determine how much a temperature change affects the reliability of a device, the Arrhenius equation can be rearranged to give the temperature acceleration factor (TAF). For example, raising the temperature of a device from 25°C to 50°C decreases the Mean Time Between Failures (MTBF) due to electromigration in contacts (Ea = 0.9 eV) by a factor of approximately 15.


                  ( Ea )   ( 1     1  )
                  (----) * (---- - ----)
        r(t)2     ( k  )   ( T1    T2 )
TAF = -------- = e
        r(t)1

Temperature Acceleration Factor
Equation 2
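
The 25°C to 50°C example can be checked numerically. The sketch below assumes the TAF is the ratio of the failure rate at the higher temperature T2 to the rate at the lower temperature T1, which follows from dividing the Arrhenius equation at T2 by the same equation at T1. The function name is my own, and the constant is 8.617 x 10^-5 eV/K rather than the rounded 8.6 x 10^-5.

```python
import math

# Boltzmann's constant in eV/K (the post rounds this to 8.6e-5).
BOLTZMANN_EV_PER_K = 8.617e-5


def temperature_acceleration_factor(ea_ev, t1_c, t2_c):
    """Return exp((Ea / k) * (1/T1 - 1/T2)) for temperatures given in Celsius."""
    t1_k = t1_c + 273.15  # convert to Kelvin before taking reciprocals
    t2_k = t2_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t1_k - 1.0 / t2_k))


if __name__ == "__main__":
    # Contact electromigration (Ea = 0.9 eV), 25 C raised to 50 C.
    taf = temperature_acceleration_factor(0.9, 25.0, 50.0)
    print(f"TAF for Ea = 0.9 eV, 25 C -> 50 C: {taf:.1f}")
```

This comes out at roughly 15, in line with the figure quoted above. Running the same 25 degree rise with Ea = 0.3 eV (gate oxide short) gives only about 2.5, which shows how strongly the result depends on which failure mode dominates.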

MTBF data for complex silicon products such as high performance processors are typically not available. For example, it is a fair assumption that as the temperature of a processor
increases, its MTBF decreases. However, without that data it is not possible to say how much an increase in temperature changes the MTBF value, or what the absolute value of the MTBF is in either case.


The only written documentation made available to end users is the semiconductor manufacturer's warranty, which varies from manufacturer to manufacturer and device to device, but is
always rated for the worst case specifications. Intel's Boxed Processor Limited Warranty states:


Intel warrants that the processor, if properly used and
installed, will be free from defects in material and
workmanship and will substantially conform to Intel's
publicly available specifications for a period of three
(3) years after the date the processor was purchased
(whether purchased separately or as part of a computer
system).

So, the short answer is - yes, increasing temperature decreases processor reliability. How much - only Intel, AMD, and IBM really know, for their specific designs, and they don't share the information with the public at large.

However - if you are doing this on a machine built by someone else - you just need to ensure that you meet their specifications for how far away walls and air obstructions need to be. (Design / sizing / placement of chassis / heatsinks / fans is the PC manufacturer's / assembler's responsibility.) This is just a purchasing decision - if I run a thermally intensive load on machine X, is the thermal solution going to be too loud for me to handle? If so, 1) return the machine to the manufacturer or place of purchase 2) go on to machine Y. If this seems like too much of a problem for you, ask your favorite magazine that reviews PCs to come up with a "standard" test for noise at high processor thermal load.

If you are building your own system, you - not the motherboard, heatsink, or chassis manufacturer - are responsible for meeting processor thermal specifications, which can be found in the processor datasheets.

-Robin

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
