Am 24.08.2014 23:51, schrieb Marek Olšák: > On Sun, Aug 24, 2014 at 6:15 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote: >> On Sun, Aug 24, 2014 at 7:44 AM, Marek Olšák <mar...@gmail.com> wrote: >>> From: Marek Olšák <marek.ol...@amd.com> >>> >>> This fixes crashes if the number of temporaries is greater than 4096. >>> >>> Bugzilla: >>> https://urldefense.proofpoint.com/v1/url?u=https://bugs.freedesktop.org/show_bug.cgi?id%3D66184&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0A&m=Lj8duk%2FDvCQDjQhJ8Av6PGRyBFIFL%2FqLAlXqHOpJdu8%3D%0A&s=413450aa014f131fbcfa4b3b7261ad21ea65ed7cb1a7e89e65579d37728a91c3 >>> Cc: 10.2 10.3 mesa-sta...@lists.freedesktop.org >>> --- >>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 >>> ++++++++++++++++++------------ >>> 1 file changed, 27 insertions(+), 18 deletions(-) >>> >>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>> index 84bdc4f..9beecf8 100644 >>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>> @@ -75,14 +75,6 @@ extern "C" { >>> (1 << PROGRAM_UNIFORM)) >>> >>> /** >>> - * Maximum number of temporary registers. >>> - * >>> - * It is too big for stack allocated arrays -- it will cause stack >>> overflow on >>> - * Windows and likely Mac OS X. >>> - */ >>> -#define MAX_TEMPS 4096 >>> - >>> -/** >>> * Maximum number of arrays >>> */ >>> #define MAX_ARRAYS 256 >>> @@ -3301,14 +3293,10 @@ get_src_arg_mask(st_dst_reg dst, st_src_reg src) >>> void >>> glsl_to_tgsi_visitor::simplify_cmp(void) >>> { >>> - unsigned *tempWrites; >>> + int tempWritesSize = 0; >>> + unsigned *tempWrites = NULL; >>> unsigned outputWrites[MAX_PROGRAM_OUTPUTS]; >>> >>> - tempWrites = new unsigned[MAX_TEMPS]; >>> - if (!tempWrites) { >>> - return; >>> - } >>> - memset(tempWrites, 0, sizeof(unsigned) * MAX_TEMPS); >>> memset(outputWrites, 0, sizeof(outputWrites)); >>> >>> foreach_in_list(glsl_to_tgsi_instruction, inst, &this->instructions) { >>> @@ -3330,7 +3318,16 @@ glsl_to_tgsi_visitor::simplify_cmp(void) >>> prevWriteMask = outputWrites[inst->dst.index]; >>> outputWrites[inst->dst.index] |= inst->dst.writemask; >>> } else if (inst->dst.file == PROGRAM_TEMPORARY) { >>> - assert(inst->dst.index < MAX_TEMPS); >>> + if (inst->dst.index >= tempWritesSize) { >>> + const int inc = 4096; >>> + >>> + tempWrites = (unsigned*) >>> + realloc(tempWrites, >>> + (tempWritesSize + inc) * >>> sizeof(unsigned)); >> >> This can fail, the old code dealt with new returning NULL. Not sure >> what the general policy for that is in mesa, but I have seen a lot of >> allocation checks. Probably makes sense here too. >> >>> + memset(tempWrites + tempWritesSize, 0, inc * sizeof(unsigned)); >>> + tempWritesSize += inc; >>> + } >>> + >>> prevWriteMask = tempWrites[inst->dst.index]; >>> tempWrites[inst->dst.index] |= inst->dst.writemask; >>> } else >>> @@ -3349,7 +3346,7 @@ glsl_to_tgsi_visitor::simplify_cmp(void) >>> } >>> } >>> >>> - delete [] tempWrites; >>> + free(tempWrites); >>> } >>> >>> /* Replaces all references to a temporary register index with another >>> index. */ >>> @@ -4158,7 +4155,9 @@ struct label { >>> struct st_translate { >>> struct ureg_program *ureg; >>> >>> - struct ureg_dst temps[MAX_TEMPS]; >>> + unsigned temps_size; >>> + struct ureg_dst *temps; >>> + >>> struct ureg_dst arrays[MAX_ARRAYS]; >>> struct ureg_src *constants; >>> struct ureg_src *immediates; >>> @@ -4299,7 +4298,16 @@ dst_register(struct st_translate *t, >>> return ureg_dst_undef(); >>> >>> case PROGRAM_TEMPORARY: >>> - assert(index < Elements(t->temps)); >>> + /* Allocate space for temporaries on demand. */ >>> + if (index >= t->temps_size) { >>> + const int inc = 4096; >>> + >>> + t->temps = (struct ureg_dst*) >>> + realloc(t->temps, >>> + (t->temps_size + inc) * sizeof(struct >>> ureg_dst)); >> >> Check alloc? >> >> With that fixed, >> >> Reviewed-by: Ilia Mirkin <imir...@alum.mit.edu> >> >> BTW, should this in any way be related to PIPE_SHADER_CAP_MAX_TEMPS? >> Looking at what drivers set this to, it's fairly small values. e.g. >> nvc0 returns 128. I guess there's limited spilling space. OTOH, it can >> do a *lot* better than TGSI at storing stuff due to optimizations, >> register reuse, etc. > > I think the limit only matters for ARB program queries. It should be > ignored by GLSL, because most drivers do register allocation anyway. > Most but not all drivers. For instance softpipe will not (it does advertize 4096 though - this is the d3d10 limit).
Roland _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev