On Fri, 2007-09-28 at 17:43 -0300, Ivan Paganini wrote:
> Hello everybody,
>
> I am beginning to take care of an IBM's JS21. The cluster consists of
> The myrinet connection was working right, but sometimes a user program
> just got stuck - one of the processes was sleeping, and all others
> were running. Then, the program hangs.
>
> Any suggestions?

Contact Myricom support?

BTW, if you are doing the debugging by yourself, start from the bottom.
Take two machines, run mx_info, mx_endpoint (should be nothing if no
programs running) and mx_counters. Then do your pingpong and further
stress tests as in the README.

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
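
The bottom-up check suggested in the reply (mx_info, mx_endpoint and mx_counters on two machines) could be scripted roughly as follows. This is only a sketch under several assumptions: the tool names are copied verbatim from the message, they are assumed to be on the remote PATH and to run without arguments, passwordless ssh to the nodes is assumed, and "node01"/"node02" are placeholder hostnames. The helper names build_commands and collect are made up for illustration.

```python
import subprocess

# Tool names are taken verbatim from the reply. Assumption: they are on the
# remote PATH and run without arguments; check your MX install.
TOOLS = ["mx_info", "mx_endpoint", "mx_counters"]


def build_commands(hosts, tools=TOOLS):
    """Return one ssh argv list per (host, tool) pair, in host-major order."""
    return [["ssh", host, tool] for host in hosts for tool in tools]


def collect(hosts, runner=subprocess.run):
    """Run every command and return {(host, tool): (returncode, output)}."""
    results = {}
    for argv in build_commands(hosts):
        done = runner(argv, capture_output=True, text=True)
        results[(argv[1], argv[2])] = (done.returncode, done.stdout + done.stderr)
    return results


if __name__ == "__main__":
    # "node01" and "node02" are placeholder hostnames, not from the thread.
    for (host, tool), (code, output) in collect(["node01", "node02"]).items():
        print(f"== {host}: {tool} (exit {code}) ==")
        print(output)
```

Saving the output from an idle pair of nodes gives a baseline to compare against after the pingpong and stress tests from the README, or after a user job hangs.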
