On 08/03/2018 05:37 PM, Cesar Philippidis wrote:
>> But I still see no rationale why blocks is used here, and I wonder
>> whether something like num_gangs = grids * 64 would give similar results.

> My original intent was to keep the load proportional to the block size.
> So, in the case where a block size is limited by shared-memory or the
> register file capacity, the runtime wouldn't excessively over-assign
> gangs to the multiprocessor units if their state is going to be swapped
> out even more than necessary.

So, that's your rationale. Please add a comment describing this.
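
For illustration only, here is a rough sketch of what such a comment could
look like next to the computation. The function names, the warp_size
parameter, and the exact formula are assumptions made for this example,
not the code from the patch; only "grids", "blocks" and the
"grids * 64" alternative come from the discussion above.

```c
/* Hypothetical helper, not the actual plugin code.  GRIDS and BLOCKS
   stand for the grid and block sizes suggested by the occupancy query;
   WARP_SIZE is assumed to be the number of threads per warp.  */
static int
default_num_gangs (int grids, int blocks, int warp_size)
{
  /* Keep the default number of gangs proportional to the block size.
     If the block size is limited by shared memory or register file
     capacity, this avoids over-assigning gangs to the multiprocessors
     when their state would be swapped out more than necessary.  */
  int warps_per_block = blocks / warp_size;

  /* Guard against a block smaller than one warp (an assumption of this
     sketch), so that at least GRIDS gangs are launched.  */
  if (warps_per_block < 1)
    warps_per_block = 1;

  return grids * warps_per_block;
}

/* The alternative raised in the review: a fixed factor that ignores
   the block size.  */
static int
fixed_num_gangs (int grids)
{
  return grids * 64;
}
```

With grids = 10 and warp_size = 32, the proportional variant gives 320
gangs for blocks = 1024 but only 80 for blocks = 256, whereas the fixed
variant gives 640 in both cases.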

Thanks,
- Tom
