In my experience running HPC applications at the Shanghai Super Computing Center, the Myrinet interconnect caused many failures, even with small applications. All the users were driven crazy by the interconnect, and we had to restart our applications again and again.
I mean no slight to SSC users or admins, but to me this sounds like some sort of mistake: temperatures, bad cables, etc. Our largest Myrinet cluster is 287 nodes, versus ~512 for SSC's cluster, but we do have a greater number of total ports. I guess it's conceivable that SSC's problem is specific to their very large switch...
_______________________________________________ Beowulf mailing list, [email protected] To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
