On Wed, 2 Dec 2015, Nathan Sidwell wrote:
> On 12/02/15 05:40, Jakub Jelinek wrote:
> > Don't know the HW good enough, is there any power consumption, heat etc.
> > difference between the two approaches?  I mean does the HW consume different
> > amount of power if only one thread in a warp executes code and the other
> > threads in the same warp just jump around it, vs. having all threads busy?
>
> Having all threads busy will increase power consumption.
>
Is that from general principles (i.e. "if it doesn't increase power
consumption, the GPU is poorly optimized"), or is that based on specific
knowledge of how existing GPUs operate (presumably reverse-engineered or
privately communicated -- I've never seen any public statements on this
point)?  The only certain case I can imagine is instructions that go to the
SFU rather than the normal SPs -- but those are relatively rare.

> It's also bad if the other vectors are executing memory access instructions.

How so?  The memory accesses are the same independent of whether you are
reading the same data from 1 thread or 32 synchronous threads.

Alexander