Yeah, these are Intel quad-core processors, and each node has 2 of them, making
it 8 cores per compute node.
I noticed that the xplor processes are using only ~10-25% of each core, while I
expected them to be running at full steam (at least ~80% of each core).
Though I was aware of non-linear scaling (I wasn't expecting an 8x speedup),
why should 8 different xplor processes not be using the 8 different cores
fully (even if limited by the memory bus), while other programs (MPI as well
as non-MPI) do utilize ~100% of each core?
Anyway, do you suggest doing the calculations in batches - like 12 or 24 (1 or
2 processes per node) at a time - to maximize resource utilization?
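For what it's worth, a machines file restricting the run to 2 slots per node could be generated with a short script like the following (a sketch only; the node names node01..node12 are hypothetical placeholders for the actual hostnames):

```python
# Sketch: write a machines file with 2 slots per node instead of 8,
# so each node runs fewer bandwidth-hungry xplor processes.
# Node names node01..node12 are hypothetical placeholders.
slots_per_node = 2
nodes = ["node%02d" % i for i in range(1, 13)]  # 12 compute nodes

with open("machines_2per", "w") as f:
    for node in nodes:
        # one line per slot: 12 nodes x 2 slots = 24 lines total
        for _ in range(slots_per_node):
            f.write(node + "\n")
```

The resulting file would then be passed via the -machines option in place of the 96-entry file.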
I'm curious about the 64 vs. 96 processor results: in one case you ran
on 8 machines and in the other on 12? Or was there some other difference?
Yeah, you are right. In one case it was on 8 nodes (8*8 = 64 processors) and
in the other on 12 of them. But still, the difference was so large that it
cannot be explained simply by non-linear scaling!
This seems to really defeat the purpose of parallelization, no?! Are there any
alternative suggestions?
Thanks and
Regards
[EMAIL PROTECTED] wrote:
This may be of general interest:
From: [EMAIL PROTECTED]
To: Nah Sivar <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
Subject: Re: longer time taken with parallelization
Date: Mon, 05 May 2008 15:48:47 -0400
Hello Nah--
> I am noticing a big performance issue when scaling up, despite the fact
> that these are just trivially parallel calculations with no inter-process
> communication. For example, even on my PC I can calculate, say, 2
> structures in 5 minutes (same script with num_Strucures = 2). When trying
> the parallel version of it with 64 processors, a 1000-structure
> calculation took 2.5 hrs (whereas it should have taken ~1 hr from an
> approximation of ~4 minutes per structure), and interestingly
> with 96 processors and 2000 structures it's taking close to 9 hrs
> for the calculation!!! /cluster/softs/xplor-nih-2.19/bin/xplor -py
> -parallel -machines 96_procs -o mod.out mod_anneal1.py and the machine
> file 96_procs has 96 processors listed (since each machine has 8
> processors, I am repeating each machine name 8 times; 12*8 = 96
> processors)
>
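The timing mismatch quoted above can be made concrete with the ideal-scaling arithmetic (a sketch; all figures are taken from the quoted message, with "2.30 hrs" read as 2 hrs 30 min):

```python
# Ideal-scaling estimate: with perfect parallelism, 64 processors work
# through 1000 structures in ceil(1000/64) "waves" of ~4 minutes each.
import math

per_structure_min = 4                 # ~4 minutes per structure (quoted above)
n_structures = 1000
n_procs = 64

waves = math.ceil(n_structures / n_procs)   # 16 waves of 64 structures
ideal_min = waves * per_structure_min       # ~64 minutes, i.e. the "~1 hr"
observed_min = 2.5 * 60                     # 2 hrs 30 min observed
slowdown = observed_min / ideal_min         # how far off ideal the run was
```

The observed run is roughly 2.3x slower than the ideal estimate, which is the gap being discussed below.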
I guess these are Intel 8-core machines. These chips share a single front-side
bus, so they will not scale up to 8 cores at all (as you've found). I suspect
that running more than 2, maybe 3, Xplor-NIH processes per CPU is a waste.
Note that 4-core AMD chips and the upcoming Intel processors with the
QuickPath Interconnect should behave much better. Your current results are a
limitation of the chip architecture: your machines may have 8 cores, but you
don't have the memory bandwidth to use even half of them efficiently.
best regards--
Charles
_______________________________________________
Xplor-nih mailing list
[email protected]
http://dcb.cit.nih.gov/mailman/listinfo/xplor-nih