> On Aug 19, 2021, at 8:54 AM, Qing Zhao via Gcc-patches
> <[email protected]> wrote:
>
>>>>> Breakpoint 1, expand_DEFERRED_INIT (stmt=0x7fffe96ae348) at
>>>>> ../../latest-gcc/gcc/internal-fn.c:3021
>>>>> 3021 mark_addressable (lhs);
>>>>> (gdb) call debug_tree(lhs)
>>>>> <ssa_name 0x7fffe9584e58
>>>>> type <real_type 0x7fffe959b2a0 float sizes-gimplified SF
>>>>> size <integer_cst 0x7fffe9579f48 constant 32>
>>>>> unit-size <integer_cst 0x7fffe9579f60 constant 4>
>>>>> align:32 warn_if_not_align:0 symtab:0 alias-set 2 canonical-type
>>>>> 0x7fffe959b2a0 precision:32
>>>>> pointer_to_this <pointer_type 0x7fffe959b7e0>>
>>>>> visited var <var_decl 0x7ffff7ff7bd0 temp1>
>>>>> def_stmt temp1_5 = .DEFERRED_INIT (4, 2, 0, &"temp1"[0]);
>>>>> version:5>
>>>>>
>>>>> when I deleted:
>>>>>
>>>>> if (TREE_CODE (lhs) == SSA_NAME
>>>>> lhs = SSA_NAME_VAR (lhs);
>>>>
>>>> but then using SSA_NAME_VAR is broken. I suspect use_register_for_decl
>>>> isn't the correct thing to look at. I think we need to look at what
>>>> the LHS expanded to if it is a SSA_VAR_P (that includes SSA names
>>>> but also plain DECLs but not what we get from VLAs where we'd see
>>>> *ptr). So sth like
>>>>
>>>> bool reg_lhs;
>>>> if (SSA_VAR_P (lhs))
>>>> {
>>>> rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>>>> reg_lhs = !MEM_P (tem);
>>>> /* If not MEM_P reg_lhs should be REG_P or SUBREG_P (but maybe
>>>> also CONCAT or lowpart...?) */
>>>> }
>>>> else
>>>> {
>>>> gcc_assert (is_vla);
>>>> reg_lhs = false;
>>>> }
>>>>
>>>> if (!reg_lhs)
>>>> memset path
>>>> else
>>>> expand_assignment path
>>>
>>> After making the following change:
>>>
>>> + bool reg_lhs = true;
>>>
>>> tree var_type = TREE_TYPE (lhs);
>>> gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
>>>
>>> - if (is_vla || (!use_register_for_decl (lhs)))
>>> + if (SSA_VAR_P (lhs))
>>> + {
>>> + rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>>> + reg_lhs = !MEM_P (tem);
>>> + }
>>> + else
>>> + {
>>> + gcc_assert (is_vla);
>>> + reg_lhs = false;
>>> + }
>>> +
>>> + if (!reg_lhs)
>>> {
>>>
>>> I got exactly the same internal error that failed at expr.c:
>>>
>>> 8436 /* We must have made progress. */
>>> 8437 gcc_assert (inner != exp);
>>>
>>>
>>> Looks like for the following code:
>>>
>>> 3026 if (!reg_lhs)
>>> 3027 {
>>> 3028 /* If this is a VLA or the variable is not in register,
>>> 3029 expand to a memset to initialize it. */
>>> 3030 mark_addressable (lhs);
>>> 3031 tree var_addr = build_fold_addr_expr (lhs);
>>> 3032
>>> 3033 tree value = (init_type == AUTO_INIT_PATTERN) ?
>>> 3034 build_int_cst (integer_type_node,
>>> 3035 INIT_PATTERN_VALUE) :
>>> 3036 integer_zero_node;
>>> 3037 tree m_call = build_call_expr (builtin_decl_implicit
>>> (BUILT_IN_MEMSET),
>>> 3038 3, var_addr, value, var_size);
>>> 3039 /* Expand this memset call. */
>>> 3040 expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type));
>>> 3041 }
>>>
>>> At line 3030, “lhs” could be a SSA_NAME.
>>>
>>> My questions are:
>>>
>>> 1. Could the routine “mark_addressable” and “build_fold_addr_expr” be
>>> applied on SSA_NAME?
>>
>> No.
>>
>>> 2. Could the routine “expand_builtin_memset” be applied on the memset call
>>> whose “DEST” is
>>> an address expression on SSA_NAME?
>>
>> No.
>>
>>> 3. Within “expand_DEFERRED_INIT”, can I call “expand_builtin_memset” to
>>> expand .DEFERRED_INIT?
>>
>> Well, not with "invalid" GENERIC I fear (address of a SSA name).
>>
>>> I suspect that one of the above 3 might be the issue, but not sure which
>>> one?
>>
>> All of the above ;) So while reg_lhs is now precise as to how the
>> variable will end up (the SSA name will end up as a stack variable in this
>> case, for whatever reason), expansion via memcpy only works when
>> working on the RTL representation. The usual "workaround" (ugh)
>> is to use make_tree (), so in the !reg_lhs path you'd do
>>
>> /* Get a new GENERIC representation for the RTL. That's necesary
>> in case LHS is an SSA name. */
>> lhs = make_tree (TREE_TYPE (lhs), tem);
>
> This resolved the issue.
However, there was two runtime errors with CPU2017 with this change.
I guess that there might be some issue with “expand_expr” + “make_tree” here.
So, I changed the code to use the alternative solution that you suggested:
>
>>
>> alternatively you could maybe do
>>
>> if (DECL_P (lhs))
>> {
>> + rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>> + reg_lhs = !MEM_P (tem);
>> }
>> else if (TREE_CODE (lhs) == SSA_NAME)
>> reg_lhs = true;
>> else
>> reg_lhs = false;
>>
>> thus treat SSA names as register storage always (even if it will end
>> up on the stack).
This did resolve all the issues including CPU2017 runtime errors.
With this solution, all SSA_NAMEs will be expanded through “expand_assignment”
even though they are in stack.
the generated code will be correct. But the performance might be a little bit
different than through memset.
Qing
>
> My question here, for a complicate structure SSA_NAME, will expanding through
> memset better than expand_asssignment?
>
> Qing
>>