On Thu, 15 Jul 2010 13:03:31 -0400, Jeff Squyres <jsquy...@cisco.com> wrote:
> Given the oversubscription on the existing HT links, could contention
> account for the difference?  (I have no idea how HT's contention
> management works) Meaning: if the stars line up in a given run, you
> could end up with very little/no contention and you get good
> bandwidth.  But if there's a bit of jitter, you could end up with
> quite a bit of contention that ends up cascading into a bunch of
> additional delay.

What contention?  Many sockets needing to access memory on another
socket via HT links?  Then yes, perhaps that could be a lot.  As shown
in the diagram, it's pretty non-uniform, and if, say, sockets 0, 1, and 3
all found memory on socket 0 (say socket 2 had local memory), then there
are two ways for messages to get from 3 to 0 (via 1 or via 2).  I don't
know if there is hardware support to re-route to avoid contention, but
if not, then socket 3 could be sharing the 1->0 HT link (which has a max
throughput of 8 GB/s, so 4 GB/s would be available per socket,
provided it was still operating at peak).  Note that splitting the
10.7 GB/s three ways gives about 3.6 GB/s each, which is still less than
this 4 GB/s.

> I fail to see how that could add up to 70-80 (or more) seconds of
> difference -- 13 secs vs. 90+ seconds (and more), though...  70-80
> seconds sounds like an IO delay -- perhaps paging due to the ramdisk
> or somesuch...?  That's a SWAG.

This problem should have had a significantly smaller resident set than
would cause paging, but these were very short jobs, so a relatively small
amount of paging would cause a big performance hit.  We have also seen
up to a factor of 10 variability in longer jobs (e.g. 1 hour for a
"fast" run) with larger working sets.  Once the pages are faulted,
this kernel (2.6.18 from RHEL5) won't migrate them around, so even if
you eventually swap out all the ramdisk, pages faulted before and after
will be mapped to all sorts of inconvenient places.

But I don't have any systematic testing with a guaranteed clean
ramdisk, and I'm not going to overanalyze the extra factors when there's
an understood factor of 3 hanging in the way.  I'll give an update if
there is any news.

Jed
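
As a rough check of the bandwidth arithmetic above, here is a minimal sketch. The 8 GB/s HT link and 10.7 GB/s memory figures are the ones quoted in the message. The function name and the assumption of a perfectly even split among sharers are illustrative only and say nothing about how the hardware actually arbitrates.

```python
def per_socket_share(peak_gb_per_s, sharers):
    """Idealized even split of a peak bandwidth among `sharers` sockets."""
    return peak_gb_per_s / sharers

# Two sockets (1 and 3) sharing the 1->0 HT link at its 8 GB/s peak.
ht_share = per_socket_share(8.0, 2)

# Three sockets (0, 1, 3) all reading from socket 0's 10.7 GB/s memory.
mem_share = per_socket_share(10.7, 3)

print(f"HT link share: {ht_share:.2f} GB/s per socket")
print(f"Memory share:  {mem_share:.2f} GB/s per socket")
print("Tighter limit:", "memory" if mem_share < ht_share else "HT link")
```

Under these assumptions the three-way memory split, not the shared HT link, is the tighter per-socket limit.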