Hi, sorry for replying so late, I still haven't recovered from two weeks of traveling and conferences.
On Sat, Sep 21 2019, Richard Sandiford wrote: > Hi, > > Thanks for doing this. > > Martin Jambor <mjam...@suse.cz> writes: >> +/* Analyze function body scan results stored in param_accesses and >> + param_accesses, detect possible transformations and store information of >> + those in function summary. NODE, FUN and IFS are all various structures >> + describing the currently analyzed function. */ >> + >> +static void >> +process_scan_results (cgraph_node *node, struct function *fun, >> + isra_func_summary *ifs, >> + vec<gensum_param_desc> *param_descriptions) >> +{ >> + bool check_pass_throughs = false; >> + bool dereferences_propagated = false; >> + tree parm = DECL_ARGUMENTS (node->decl); >> + unsigned param_count = param_descriptions->length(); >> + >> + for (unsigned desc_index = 0; >> + desc_index < param_count; >> + desc_index++, parm = DECL_CHAIN (parm)) >> + { >> + gensum_param_desc *desc = &(*param_descriptions)[desc_index]; >> + if (!desc->locally_unused && !desc->split_candidate) >> + continue; > > I'm jumping in the middle without working through the whole pass, > so this is probably a daft question sorry, but: what is this loop > required to do when: > > !desc->split_candidate && desc->locally_unused You have figured out correctly that this is a thinko. I meant not to continue for non-register-types which might not be used locally but their locally_unused flag is only set a few lines below... > > ? AFAICT... > >> + >> + if (flag_checking) >> + isra_verify_access_tree (desc->accesses); >> + >> + if (!dereferences_propagated >> + && desc->by_ref >> + && desc->accesses) >> + { >> + propagate_dereference_distances (fun); >> + dereferences_propagated = true; >> + } >> + >> + HOST_WIDE_INT nonarg_acc_size = 0; >> + bool only_calls = true; >> + bool check_failed = false; >> + >> + int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index; >> + for (gensum_param_access *acc = desc->accesses; >> + acc; >> + acc = acc->next_sibling) >> + if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, &only_calls, >> + entry_bb_index)) >> + { >> + check_failed = true; >> + break; >> + } >> + if (check_failed) >> + continue; >> + >> + if (only_calls) >> + desc->locally_unused = true; ...specifically here. >> + >> + HOST_WIDE_INT cur_param_size >> + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm))); >> + HOST_WIDE_INT param_size_limit; >> + if (!desc->by_ref || optimize_function_for_size_p (fun)) >> + param_size_limit = cur_param_size; >> + else >> + param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR) >> + * cur_param_size); >> + if (nonarg_acc_size > param_size_limit >> + || (!desc->by_ref && nonarg_acc_size == param_size_limit)) >> + { >> + disqualify_split_candidate (desc, "Would result into a too big set of" >> + "replacements."); >> + } >> + else >> + { >> + /* create_parameter_descriptors makes sure unit sizes of all >> + candidate parameters fit unsigned integers restricted to >> + ISRA_ARG_SIZE_LIMIT. */ >> + desc->param_size_limit = param_size_limit / BITS_PER_UNIT; >> + desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT; >> + if (desc->split_candidate && desc->ptr_pt_count) >> + { >> + gcc_assert (desc->by_ref); >> + check_pass_throughs = true; >> + } >> + } >> + } > > ...disqualify_split_candidate should be a no-op in that case, > because we've already disqualified the parameter for a different reason. > So it looks like the main effect is instead to set up param_size_limit > and nonarg_acc_size, the latter of which I assume is 0 when > desc->locally_unused. This is the only bit where you are wrong, param_size_limit is the type size for aggregates and twice pointer size for pointers (well, actually PARAM_IPA_SRA_PTR_GROWTH_FACTOR times the size of a pointer). Even for locally unused parameters because we might "pull" some of them from callees, when there are some. But it is not really relevant for the problem you are facing. > > The reason for asking is that the final "else" says that we've already > checked that param_size_limit is in range, but that's only true if > desc->split_candidate. In particular: > > if (is_gimple_reg (parm) > && !isra_track_scalar_param_local_uses (fun, node, parm, num, > &scalar_call_uses)) > { > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file, " is a scalar with only %i call uses\n", > scalar_call_uses); > > desc->locally_unused = true; > desc->call_uses = scalar_call_uses; > } > > happens before: > > if (type_size == 0 > || type_size >= ISRA_ARG_SIZE_LIMIT) > { > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file, " not a candidate, has zero or huge size\n"); > continue; > } > > An uninteresting case in which we truncate the value is: > > typedef unsigned int big_vec __attribute__((vector_size (0x20000))); > void foo (big_vec x) {} > > and going further triggers an assert in the tree_to_uhwi (...): > > typedef unsigned int omg_vec __attribute__((vector_size (1ULL << 61))); > void foo (omg_vec x) {} > > but as you can guess, SVE is the real reason I'm asking. ;-) > > FWIW, I tried changing it to: > > gensum_param_desc *desc = &(*param_descriptions)[desc_index]; > if (!desc->split_candidate) > continue; > > and didn't get any test regressions (on aarch64-linux-gnu). It is the correct thing to do, sorry for the breakage. I have to run now but will prepare a patch tomorrow. Thanks, Martin