I certainly don't know what the problem is, but can anyone tell me (or point me to...) how an "unlimited" stack size works?

Peter
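For reference, a minimal sketch of how to look at the limit in question. As far as I understand it, "unlimited" is just the special value RLIM_INFINITY stored in the RLIMIT_STACK resource limit: it reserves no memory, it only removes the cap on how far the stack may grow. The helper names below (describe_limit, stack_limits) are made up for illustration; the resource module calls are standard Python on Unix.

```python
import resource


def describe_limit(value):
    """Render an rlimit value; RLIM_INFINITY is what the shell shows as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def stack_limits():
    """Return the (soft, hard) stack size limits of the current process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = stack_limits()
    print("stack soft limit:", soft)
    print("stack hard limit:", hard)
```

Limits are per-process and inherited from the parent, so the values seen in an interactive shell are not necessarily the ones the processes started by mpirun get; running a check like this under mpirun may be worth a try.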
On 8/18/08, Mikhail Kuzminsky <[EMAIL PROTECTED]> wrote:
>
> I ran a set of HPC Challenge benchmarks on ONE dual-socket quad-core
> Opteron 2350 (Rev. B3) based server (8 logical CPUs).
> RAM size is 16 GB. The tests were performed under SuSE 10.3/x86-64, with
> LAM MPI 7.1.4 and MPICH 1.2.7 from the SuSE distribution, using Atlas 3.9.
> Unfortunately there is only one such cluster node, and I can't reproduce the
> run on another node :-(
>
> For N (matrix size) up to 10000 all looks OK. But for larger N
> (15000/20000/...) hpcc execution (mpirun -np 8 hpcc) leads to a Linux hang-up.
>
> In the "top" output I see 8 hpcc processes, each eating about 100% of CPU,
> with reasonable amounts of virtual and RSS memory per hpcc process, and no
> swap usage. Usually there are no PTRANS results in the hpccoutf.txt
> results file, but in a few cases (when I "actively watched" the hpcc
> execution by issuing ps/top) I see reasonable PTRANS results but
> no HPLinpack results. One time I obtained PTRANS, HPL and DGEMM
> results for N=20000, but the hang-up came later, in the STREAM tests. Maybe
> it's simply because the output buffer was not flushed to the output
> file on HDD at the time of the hang-up.
>
> One possible reason for the hang-ups is a memory hardware problem, but what
> about possible software reasons?
> The hpcc executable is 64-bit and dynamically linked. /etc/security/limits.conf
> is empty. The stacksize limit (for the user issuing mpirun) is "unlimited", the
> main memory limit is about 14 GB, and the virtual memory limit is about 30 GB.
> Atlas was compiled for 32-bit integers, but that's enough for such N values.
> Even /proc/sys/kernel/shmmax is 2^63-1.
>
> What else could be the reason for the hang-up?
>
> Mikhail Kuzminskiy
> Computer Assistance to Chemical Research Center
> Zelinsky Institute of Organic Chemistry
> Moscow
>
> _______________________________________________
> Beowulf mailing list, [email protected]
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
