> On 05/01/2013 08:51 AM, Christopher Samuel wrote:
>> This sounds interesting..
>>
>> http://www.theregister.co.uk/2013/05/01/amd_huma/

unfortunately, nothing much new there. we knew from other leaks that
there would be systems with both normal ddr and gddr mapped into the
same coherent physical space. that's really the point of the whole HSA
thing, and goes along with the impending placement of ram chips onto
the APU package by both AMD and Intel.

this is a very good thing for HPC. it's obviously not hard to build
even conventional SIMD CPU architectures that have serious bandwidth
issues with conventional cache-mitigated dram, even without resorting
to GPU-like wider arrays of ALUs. I don't see that it'll complicate
much - just another flag to mmap (like MAP_HUGETLB).

>
> "In today's CPU-GPU computing schemes, when a CPU senses that a process
> upon which it is working might benefit from a GPU's muscle, it has to ...
> That last time I checked, CPUs don't sense anything - the programmer has
> has to write the program to use the GPUs muscle.

I'm shocked that anyone would accuse theregister of whimsy or
anthropomorphization! though in a narrow sense, "sense" here could
mean "application called ACML's FFT with appropriate size/params, so
use GPU rather than CPU".

anyway, here's my understanding of the state of things:

- there will be a Haswell chip with in-package ram of some sort.
  it's extremely unclear what kind of ram, though - some people
  claim it's a custom 128M edram (not sure why 'e', since it's
  not embedded). there are also claims this acts as last-level cache.

- AMD is making APU chips that will talk both ddr and gddr,
  the latter presumably on-chip. they'll be shipping significant
  volumes by way of xbox-next and ps4 consoles...

- Nvidia has a plan for in-package dram as well, but it's years away.

- HMC consortium (incl Micron, Samsung, IBM, but not Intel) has
  a standard that seems well-suited for integration via 2.5d interposer.

from an HPC perspective, faster memory is unambiguously good, even if
it's fixed in size, unupgradable, asymmetric.
turning the GPU into more of a first-class on-chip functional unit will
provide a much more manageable programming model.
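
to make the "just another flag to mmap" remark concrete, here is a minimal sketch of how MAP_HUGETLB is used on Linux today. it shows only the existing flag; whatever flag (if any) ends up selecting gddr-backed pages is unknown, so none is shown. the helper name map_anon and its got_huge out-parameter are made up for illustration, and the fallback to ordinary pages is my own assumption about what a caller would want.

```c
#define _GNU_SOURCE             /* exposes MAP_ANONYMOUS / MAP_HUGETLB on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of zeroed anonymous memory.
 * Tries huge pages first (len should be a multiple of the huge page
 * size, commonly 2 MiB on x86-64) and falls back to ordinary pages if
 * the kernel refuses, e.g. because no huge pages are reserved.
 * Sets *got_huge to 1 or 0 accordingly; returns NULL on total failure. */
static void *map_anon(size_t len, int *got_huge)
{
    const int prot  = PROT_READ | PROT_WRITE;
    const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p = MAP_FAILED;

    assert(len > 0);

#ifdef MAP_HUGETLB
    /* The special memory type is requested with one extra flag bit. */
    p = mmap(NULL, len, prot, flags | MAP_HUGETLB, -1, 0);
#endif
    if (got_huge)
        *got_huge = (p != MAP_FAILED);

    if (p == MAP_FAILED)        /* fall back to normal pages */
        p = mmap(NULL, len, prot, flags, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

a caller would do something like `buf = map_anon(2u << 20, &huge)` and later `munmap(buf, 2u << 20)`. the point is only that the application-visible change is one flag; placement policy stays in the kernel.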
