OK. Reviewed-by: Marek Olšák <marek.ol...@amd.com>
Marek

On Mon, May 9, 2016 at 10:41 PM, Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> wrote:
> On Mon, May 9, 2016 at 9:55 PM, Marek Olšák <mar...@gmail.com> wrote:
>> On Sun, May 8, 2016 at 1:00 PM, Bas Nieuwenhuizen
>> <b...@basnieuwenhuizen.nl> wrote:
>>> The calculated limit gave problems on SI as it was > 32 KiB
>>> and the hardware LDS size on SI is only 32 KiB. It isn't
>>> correct anyway when processing multiple patches in a threadgroup.
>>>
>>> As we potentially have any number of patches such that the
>>> used LDS is at most the hardware LDS size, and exact size
>>> per patch is not known at compile time, this seems like
>>> the only valid bound.
>>
>> I think Nicolai sent out a similar patch.
>
> Yes, I've sent this patch as a reply.
>>
>> Would tessellation break if we set the size to 0 or a similar small
>> number? It seems like seemingly high LDS usage can trick LLVM now or
>> in the future into thinking that it doesn't have to schedule
>> instructions for low register usage.
>
> The problem with smaller sizes than total LDS is that we have no
> guarantee that it will be placed at the same position for all shader
> stages, and LLVM could introduce new LDS variables (see the promote
> alloca pass which promotes alloca to LDS memory, although it currently
> has several wrong assumptions for non-compute shaders anyway[1]).
>
> The LDS usage is per threadgroup though, so using the entire LDS does
> not put a limit on the number of waves that are in the threadgroup.
> Furthermore in the graphics pipeline we have to consider that even
> though the current shader has a limit on the number of active waves
> due to LDS there might be other stages that do not use LDS and could
> use the registers.
>
> Note that neither too small nor too large (as long as it fits in LDS)
> is likely to lead to problems at this moment.
>
> - Bas
>
> [1]:
> - It assumes all LDS can be used. However VS as VS can't use any LDS
>   at all, and PS has to leave some for the inputs.
> - It assumes that the max threadgroup size for non-compute shaders is
>   a single wave.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev