> Jason R. and John,
> Was the roach running a particularly intensive design around the time of
> the failure? Just wondering why this part would be failing. Is
> the current limit somehow being exceeded?

We don't know about the first one, because it came to us from Socorro, but
the second roach was being used to test the tutorials, so I don't think it
was particularly heavily loaded.

I had a thought that we should check the serial numbers and see if they
are from the same batch.  Maybe some bad parts or ESD damage?

John

> Thanks,
> Glenn
>
> On Tue, Jun 19, 2012 at 9:52 AM, Jason Ray <j...@nrao.edu> wrote:
>> The first time I was troubleshooting this problem, I did see a fault on the
>> 1V supply with roach_monitor.py.  I didn't check roach_monitor.py on the
>> second roach because the problem was so fresh in our minds that we just
>> jumped to the finish line and checked the mosfet with a meter, then
>> replaced it.
>>
>> For reference, the part in question is Q13 (FD6675BZ).
>>
>> Thanks,
>> Jason
>>
>>
>>
>> At 09:33 AM 6/19/2012, Jason Manley wrote:
>>>
>>> Good sleuthing!
>>>
>>> FWIW, roach_monitor.py is supposed to be able to pull the log out of the
>>> Actel Fusion, which should have logged a fault on the 1V rail before
>>> shutting down the board. This should work independent of PPC or dmesg
>>> states. I'm afraid I have little faith in the Fusion/Xport combo to
>>> reliably catch these issues, but it has helped me a few times.
>>>
>>> If it works, it only retrieves the reason for the last shutdown, so you'll
>>> have to plug a laptop into the Xport to query it directly after the board
>>> shuts itself down.
>>>
>>> Jason
>>>
>>> On 19 Jun 2012, at 15:23, John Ford wrote:
>>>
>>> > Hi all.  We've had a couple of ROACH failures with identical causes.
>>> > Maybe some of you have seen this, but it's worth keeping in mind in case
>>> > you have a problem.
>>> >
>>> > The symptom is that the ROACH would sort of power on, but then turn off
>>> > spontaneously.  On one, as soon as the bof was loaded the roach would
>>> > turn off.  The other one would come on for a few seconds and then turn
>>> > off, or it would cycle on and off.  The monitor readout in dmesg gave
>>> > nonsense readings.
>>> >
>>> > In any event, the cause was traced to the +1 volt supply MOSFET switch.
>>> > Replacing that mosfet fixed both roaches.  Kudos to Jason Ray for
>>> > finding the problem originally.
>>> >
>>> > John
>>> >
>>> >
>>> >
>>
>>
>>
>


