Hi HPC experts,

     I am seeking your advice on resolving a problem with a storage (NAS)
server that hangs repeatedly.

     We have a 23-node Rocks 5.1 HPC cluster. A Sun storage array with 12 TB
of capacity is attached to a management server, a Sun Fire X4150 running RHEL
5.3, and this server is connected to the Gigabit switch that provides the
cluster's private network. The home directories on the cluster are NFS-mounted
from the storage partitions on all nodes, including the master.

     This server hangs repeatedly. As an initial troubleshooting step we
installed Ganglia to check network utilization, but it looks normal. We are
not sure how to troubleshoot further. Can anybody help us resolve this
issue?
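
Since network utilization alone has not shown anything, one way to gather
more evidence before the next hang would be to record the server's load
average to a local file at short intervals, so that the last samples before
a hang survive a reboot. What follows is only a minimal sketch, not something
we have run: it assumes a Python 3 interpreter is available on the server
(stock RHEL 5.3 ships an older Python), and the log path and the function
names are hypothetical. The only system interface it relies on is the
standard Linux /proc/loadavg file.

```python
import os
import time


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict.

    The file holds five fields: the 1, 5 and 15 minute load averages,
    running/total scheduling entities, and the most recent PID.
    """
    fields = text.split()
    running, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "running": int(running),
        "total": int(total),
    }


def format_sample(sample, timestamp):
    """Render one sample as a single log line."""
    return "%s load1=%.2f load5=%.2f load15=%.2f procs=%d/%d\n" % (
        timestamp,
        sample["load1"],
        sample["load5"],
        sample["load15"],
        sample["running"],
        sample["total"],
    )


def log_once(log_path="/var/tmp/nas-health.log", src="/proc/loadavg"):
    """Append one timestamped sample to log_path (hypothetical location)."""
    with open(src) as f:
        sample = parse_loadavg(f.read())
    line = format_sample(sample, time.strftime("%Y-%m-%d %H:%M:%S"))
    with open(log_path, "a") as out:
        out.write(line)
        out.flush()
        # Force the line to disk so it is not lost if the server hangs.
        os.fsync(out.fileno())
    return line
```

Calling log_once() every few seconds from a small loop, or once a minute from
cron, would show whether the load climbs steadily before a hang (which on an
NFS server often points to processes blocked on disk I/O) or whether the
server stops without warning.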

Thanks,
Sangamesh
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
