just in case not everyone is on the PAPER list...
I've shut down the GPU boxes, which I think are the predominant source of
heat. After 10 minutes or so, the temperature looks to have leveled off at ~56C.

On Wed, Dec 17, 2014 at 8:12 AM, danny jacobs <[email protected]>
wrote:
>
> Here are the sensor readings from the X box rack. I believe temp 1 samples
> near the floor and temp 2 near the top of the rack.
>
>  Whatever is happening, it's happening gradually.  We had a 50C alarm on
> the hot sensor last night. I suggest that if it's not an obvious fix we
> suspend observing for tonight; the X boxes will just fall over otherwise.
>
> On Wed, Dec 17, 2014 at 4:14 AM, William Walbrugh <[email protected]>
> wrote:
>>
>> Hi Dave,
>>
>> Seems something is a bit off - I'm copying Matthys and Sky in on this.
>>
>> Matthys/Sky, could you please check up on the PAPER container HVAC? It
>> seems not to be performing optimally. Please check the condenser fans
>> outside, internal airflow, refrigerant level, and any fault codes.
>>
>> Thanks and regards,
>> William
>> On 17 Dec 2014 10:21 AM, "David MacMahon" <[email protected]> wrote:
>>
>>> Hi, William,
>>>
>>> I've been seeing more than the usual number of automated status messages
>>> with marginal values.  Thinking this might be a canary in the coal mine, I
>>> plotted the temperatures logged from the IPMI interfaces of the
>>> correlator's X boxes.  This is not to be confused with "TMON" data, but I'd
>>> be curious to see what that shows as well.
>>>
>>> The attached plot shows the "Peripheral Temp" readings for each X box
>>> (px1 through px8) for the last ~70 days.  The plots clearly show a daily
>>> 12-hour swing in temperature that corresponds to when the X engines are
>>> actually correlating/integrating (i.e. when the GPUs are in use).  This
>>> pattern has been quite stable until the past few days when the readings
>>> started getting higher and higher each day.
>>>
>>> Can you please arrange for a general checkup of the PAPER container?
>>> I'm going to keep things running the same for now, but will be ready to
>>> shut them down if needed.  I'm guessing a coolant leak in the chiller, but
>>> that's just a guess...
>>>
>>> Thanks and happy holidays!!!
>>>
>>> Cheers,
>>> Dave
>>>
>>>
>
> --
>
> National Science Foundation Fellow
> Arizona State University
> School of Earth and Space Exploration
> Low Frequency Cosmology
> Phone:           (505) 500 4521
> Homepage:     http://loco.lab.asu.edu/danny_jacobs/
>


