Just in case not everyone is on the PAPER list... I've shut down the GPU boxes, which I think are the predominant source of heat. After 10 minutes or so, the temperature looks like it may have leveled off at ~56C.

On Wed, Dec 17, 2014 at 8:12 AM, danny jacobs <[email protected]> wrote:
>
> Here are the sensor readings from the xbox rack. I believe temp 1 samples
> near the floor and 2 near the top of the rack.
>
> Whatever is happening, it's happening gradually. We had a 50C alarm on
> the hot sensor last night. I suggest that if it's not an obvious fix we
> suspend observing for tonight; the x boxes will just fall over otherwise.
>
> On Wed, Dec 17, 2014 at 4:14 AM, William Walbrugh <[email protected]>
> wrote:
>>
>> Hi Dave,
>>
>> Seems something is a bit off - I'm copying Matthys and Sky in on this.
>>
>> Matthys/Sky could you please check up on the PAPER container HVAC? It
>> seems as if it is not performing optimally. Please check fans on condenser
>> outside, internal airflow, refrigerant level and any fault codes etc.
>>
>> Thanks and regards,
>> William
>> On 17 Dec 2014 10:21 AM, "David MacMahon" <[email protected]> wrote:
>>
>>> Hi, William,
>>>
>>> I've been seeing more than the usual number of automated status messages
>>> with marginal values. Thinking this might be a canary in the coal mine, I
>>> plotted the temperatures logged from the IPMI interfaces of the
>>> correlator's X boxes. This is not to be confused with "TMON" data, but I'd
>>> be curious to see what that shows as well.
>>>
>>> The attached plot shows the "Peripheral Temp" readings for each X box
>>> (px1 through px8) for the last ~70 days. The plots clearly show a daily 12
>>> hour swing in temperature that corresponds to when the X engines are
>>> actually correlating/integrating (i.e. when the GPUs are in use). This
>>> pattern has been quite stable until the past few days when the readings
>>> started getting higher and higher each day.
>>>
>>> Can you please arrange for a general checkup of the PAPER container?
>>> I'm going to keep things running the same for now, but will be ready to
>>> shut them down if needed. I'm guessing a coolant leak in the chiller, but
>>> that's just a guess...
>>>
>>> Thanks and happy holidays!!!
>>>
>>> Cheers,
>>> Dave
>>>
>>>
>
> --
>
> National Science Foundation Fellow
> Arizona State University
> School of Earth and Space Exploration
> Low Frequency Cosmology
> Phone: (505) 500 4521
> Homepage: http://loco.lab.asu.edu/danny_jacobs/
>

--
National Science Foundation Fellow
Arizona State University
School of Earth and Space Exploration
Low Frequency Cosmology
Phone: (505) 500 4521
Homepage: http://loco.lab.asu.edu/danny_jacobs/
