On Tue, Jan 8, 2019, 7:55 PM Ilia Mirkin <imir...@alum.mit.edu> wrote:

> On Tue, Jan 8, 2019 at 7:26 PM Marek Olšák <mar...@gmail.com> wrote:
> >
> > On Tue, Jan 8, 2019 at 7:18 PM Ilia Mirkin <imir...@alum.mit.edu> wrote:
> >>
> >> On Tue, Jan 8, 2019 at 6:21 PM Marek Olšák <mar...@gmail.com> wrote:
> >> >
> >> > On Tue, Jan 8, 2019 at 5:25 PM Ilia Mirkin <imir...@alum.mit.edu> wrote:
> >> >>
> >> >> Why does this need to be in p_state? And who is responsible for
> >> >> setting it (and how will it be set)?
> >> >
> >> >
> >> > Oh right, there is a way to get it out of p_state.h if needed.
> >> >
> >> > It should be set to 0 by default.
> >> >
> >> > If your thread block is 8x8x1, but you need to launch 10x8x1 threads,
> >> > set partial_block = {2, 0, 0}. It will launch the following thread blocks:
> >> > 8x8x1
> >> > 2x8x1
> >> >
> >> > It's the same as launching 16x8x1 threads and doing this at the
> >> > beginning of the compute shader:
> >> > if (globalThreadID.x >= 10) return;
> >>
> >> But that all sounds like something a state tracker wouldn't care
> >> about, right? In e.g. GLSL you can specify the block to be 10x8x1 and
> >> let the backend work it all out. Should st/mesa care about this (or
> >> clover or whatever)?
> >
> >
> > The block size should be a multiple of 64 on radeonsi to utilize all
> > SIMD lanes. If you want to launch 8192+1 threads with the block size of 64,
> > you need to launch 1 partial block with the block size of 1 at the end.
> > OpenGL can't do this.
>
> Ohhhhhhhhhh. So the partial-ness applies to the last-executed block.
> If you have a local_size=(2,2,1), and you want your global grid to be,
> say, (5,4,1), with unextended GL you might run it as groups = (3,2,1)
> which would end up invoking a bunch of bits you don't want, and so
> this partial_size is a way to say that you don't want the last "line"
> to be executed at all.
>
> That makes sense, and seems like a reasonable thing to have in
> pipe_grid_info. The documentation did not make that clear the first
> time I read it, but now I'm having trouble suggesting improvements to
> it. So I think it's fine.
>

Yep. The partialness adds additional blocks to the grid with disabled
threads (lanes). I might rename it to grid_padding[3]. Or I might keep the
whole thing private in radeonsi. It's useful for e.g. compute-based image
blits when the blit box is not aligned to the block size.

Marek

> The p_state.h bits are Acked-by: Ilia Mirkin <imir...@alum.mit.edu>.
> Can't speak as to the radeonsi bits.
>
> Cheers,
>
> -ilia
>
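
For readers following the arithmetic in this thread, here is a minimal sketch of how a desired thread count could be split into a block count plus a partial last block. The struct grid_split and the function split_grid are hypothetical names made up for illustration; they are not part of p_state.h or radeonsi. The sketch also assumes that the block count includes the trailing partial block, which the thread does not state explicitly.

```c
#include <assert.h>

/* Hypothetical helper for illustration only; not part of Mesa's p_state.h. */
struct grid_split {
    /* Blocks per dimension, counting a trailing partial block (assumption). */
    unsigned grid[3];
    /* Threads enabled in the last block; 0 means the last block is full. */
    unsigned partial_block[3];
};

static struct grid_split split_grid(const unsigned threads[3],
                                    const unsigned block[3])
{
    struct grid_split s;

    for (int i = 0; i < 3; i++) {
        assert(block[i] > 0);
        /* Round up so the leftover threads get a block of their own. */
        s.grid[i] = (threads[i] + block[i] - 1) / block[i];
        /* Size of that last block along this dimension, or 0 if it is full. */
        s.partial_block[i] = threads[i] % block[i];
    }
    return s;
}
```

With block = {8, 8, 1} and threads = {10, 8, 1} this yields grid = {2, 1, 1} and partial_block = {2, 0, 0}, matching the 8x8x1 + 2x8x1 example quoted above. With local_size = (2, 2, 1) and a global size of (5, 4, 1) it yields grid = {3, 2, 1} and partial_block = {1, 0, 0}.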
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev