https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69921
            Bug ID: 69921
           Summary: Switch OpenACC kernels number of gangs from "decide at run time" to "decide at compile time"
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: openacc
          Severity: minor
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: nathan at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---

Follow-up to r233634. See
<http://news.gmane.org/find-root.php?message_id=%3C87bn7v4b0m.fsf%40kepler.schwinge.homeip.net%3E>.

For OpenACC kernels, gcc/tree-parloops.c:create_parallel_loop currently sets:

    OMP_CLAUSE_NUM_GANGS_EXPR (clause) = n_threads;

..., that is, to zero ("decide at run time").

| Originally, I wanted to use:
|
|     OMP_CLAUSE_NUM_GANGS_EXPR (clause) = build_int_cst (integer_type_node, n_threads == 0 ? -1 : n_threads);
|
| ... to store -1 "have the compiler decide" (instead of now 0 "have the
| run-time decide", which might prevent some code optimizations, as I
| understand it) for the n_threads == 0 case, but it seems that for an
| offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is
| called with the parameter "used" set to 0 instead of "gang", and then the
| "Default anything left to 1 or a partitioned default" logic will default
| dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the
| oacc_default_dims["gang"] (that is, 32). Nathan, does that smell like a
| bug (and could you look into that)?
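
The behavior described in the quoted text can be modeled with a small standalone C sketch. This is not GCC source: the names DIM_GANG, DIM_MASK, encode_num_gangs, resolve_dims and resolve_gang are hypothetical stand-ins, and the selection rule is an assumption reconstructed from the description above (a negative dimension gets the default size only if its axis is marked in "used", otherwise the minimum). The gang values 1 and 32 come from the report; the worker and vector entries are placeholders.

```c
/* Simplified model of the defaulting logic described in this report.
   Not GCC code: all identifiers here are illustrative.  */

enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };
#define DIM_MASK(ix) (1u << (ix))

/* Gang values (1 and 32) are the ones quoted in the report; the worker
   and vector entries are placeholders for this sketch.  */
static const int min_dims[DIM_MAX] = { 1, 1, 1 };
static const int default_dims[DIM_MAX] = { 32, 1, 1 };

/* Proposed encoding of the num_gangs clause value:
   -1 means "have the compiler decide", 0 means "have the run-time
   decide", and a positive value is an explicit gang count.  */
static int
encode_num_gangs (int n_threads)
{
  return n_threads == 0 ? -1 : n_threads;
}

/* Assumed form of "Default anything left to 1 or a partitioned
   default": a still-negative dimension becomes the default size if its
   axis is marked in USED, and the minimum size otherwise.  */
static void
resolve_dims (int dims[DIM_MAX], unsigned used)
{
  for (int ix = 0; ix != DIM_MAX; ix++)
    if (dims[ix] < 0)
      dims[ix] = (used & DIM_MASK (ix)) ? default_dims[ix] : min_dims[ix];
}

/* Convenience wrapper: resolve only the gang dimension for a given
   n_threads and USED mask.  */
static int
resolve_gang (int n_threads, unsigned used)
{
  int dims[DIM_MAX] = { encode_num_gangs (n_threads), 1, 1 };
  resolve_dims (dims, used);
  return dims[DIM_GANG];
}
```

Under these assumptions, resolve_gang (0, 0) yields 1, which mirrors the reported symptom for a kernels region where "used" is 0, while resolve_gang (0, DIM_MASK (DIM_GANG)) yields 32, the outcome the report expected.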