On 7/19/21, 9:12 AM, "Beowulf on behalf of Prentice Bisbal via Beowulf"
<[email protected] on behalf of [email protected]> wrote:
Doug,
<snip>
I know that there is a direct relationship between system failure and
operating temperature, but I don't know if that applies to all
components, or just those with moving parts. Someone somewhere must
have done research on this. I know Google did research on hard drive
failure that was pretty popular. I would imagine they would have
researched this, too.
In general, it follows the Arrhenius relationship with some TBD exponent (the
activation energy depends on the failure mechanism). A common rule of thumb is
that a 10 C rise makes things age twice as fast.
There's all sorts of background physics to this: drift of metallization and
doping, radiation accumulation, etc., etc.
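To put rough numbers on this, here is a minimal sketch comparing the Arrhenius
acceleration factor with the "doubles every 10 C" rule of thumb. The function
names and the 0.7 eV activation energy are illustrative assumptions, not
measured values; the real activation energy depends on the failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_use_c, t_stress_c, activation_energy_ev):
    """Arrhenius acceleration factor between two temperatures given in deg C.

    AF = exp((Ea / k) * (1/T_use - 1/T_stress)), with temperatures in kelvin.
    A value above 1 means aging is faster at t_stress_c than at t_use_c.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)


def rule_of_thumb_factor(t_use_c, t_stress_c):
    """'Ages twice as fast per 10 C rise' approximation."""
    return 2.0 ** ((t_stress_c - t_use_c) / 10.0)


if __name__ == "__main__":
    # Assumed example: 40 C vs 50 C, with a hypothetical Ea of 0.7 eV.
    print(acceleration_factor(40.0, 50.0, 0.7))
    print(rule_of_thumb_factor(40.0, 50.0))
```

With that assumed 0.7 eV, a 40 C to 50 C step comes out a bit above 2x, which
is why the rule of thumb is serviceable near room temperature. A different
activation energy or temperature range moves it noticeably.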
Cycling is a different failure mechanism, and there it's propagation of
microscopic defects with each cycle, as well as the more obvious "cracks in
solder/PWB trace" kind of thing. One of the big issues today is the difference
in CTE (coefficient of thermal expansion) between the chips (or their packages)
and the PWB. Column and Grid
arrays that are soldered in have an issue with the corner pins/balls/columns
being stressed more than the sides, and any time you have cyclic stress, you
have the prospect of work hardening and micro crack propagation. Sockets with
interposers do help with this, because they tolerate the changing misalignment
without failing. OTOH, now you have a socket and an interposer, which can fail
too.
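As a rough illustration of why the corners see more stress than the sides, here
is a first-order sketch of the shear strain a CTE mismatch puts on a joint. It
assumes a rigid package and board, uniform temperature, and that the joint
takes up all of the differential expansion, so strain scales with the distance
from the package center (the neutral point). The function names and the numbers
in the example are hypothetical, chosen only for illustration.

```python
import math


def distance_from_neutral_point_mm(x_mm, y_mm):
    """Distance of a joint from the package center, taken as the neutral point."""
    return math.hypot(x_mm, y_mm)


def shear_strain(dnp_mm, delta_cte_ppm_per_c, delta_t_c, joint_height_mm):
    """First-order shear strain on a joint from CTE mismatch.

    strain ~= DNP * delta_CTE * delta_T / joint_height
    delta_cte_ppm_per_c is the CTE difference between package and board.
    """
    displacement_mm = dnp_mm * delta_cte_ppm_per_c * 1e-6 * delta_t_c
    return displacement_mm / joint_height_mm


if __name__ == "__main__":
    # Assumed example: joints at +/-10 mm from center, 12 ppm/C mismatch,
    # 60 C swing per cycle, 0.5 mm joint height.
    corner = distance_from_neutral_point_mm(10.0, 10.0)
    side_mid = distance_from_neutral_point_mm(10.0, 0.0)
    print(shear_strain(corner, 12.0, 60.0, 0.5))
    print(shear_strain(side_mid, 12.0, 60.0, 0.5))
```

In this simplified model a corner joint on a square array sees sqrt(2) times
the strain of the joint at the middle of a side, on every thermal cycle.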