On 03/03/2014 05:06 PM, Brice Goglin wrote:
On 03/03/2014 23:02, Gus Correa wrote:
I rebooted the node and ran hwloc-gather-topology again.
This time it didn't throw any errors on the terminal window,
which may be a good sign.

[root@node14 ~]# hwloc-gather-topology /tmp/`date +"%Y%m%d%H%M"`.$(uname -n)
Hierarchy gathered in /tmp/201403031639.node14.tar.bz2 and kept in
/tmp/tmp.FM97IQCCKc/201403031639.node14/
Expected topology output stored in /tmp/201403031639.node14.output

I attach the diagnostic files.
Was the problem fixed with the processor re-seating, or is it still
there?

Everything looks good now. Looks like the problem is gone. Something bad
happened somewhere before you replugged the processor; we'll never know
exactly what :)

Brice

Hi Brice

Reporting back to you that I ran the OMPI connectivity_c.c example on
node14, binding to core, and everything worked fine.
So, I am moving node14 back to production.

When I removed one of node14's processors from the socket,
I saw a sub-millimeter sized bit of dust, which I then blew away.
I am not sure if it was there already, or made it in when I
took the processor out.
In any case, that tiny bit of dust is the only suspect
I have for causing the problem.
Computer rooms need to be vacuum cleaned.  Occasionally at least.  :)

Many thanks for your help.
This nowhere land between HW and SW is always a slippery road,
and I am glad that you guided me to a solution.

Regards,
Gus Correa
