On Thu, May 4, 2017 at 11:04 AM, Jan Hubicka <hubi...@ucw.cz> wrote: >> > >> >Sure, I'm not questioning the patch, just wondering if we shouldn't >> >improve >> >store-merging further (we want to do it anyway for e.g. bitop adjacent >> >operations etc.). >> >> We definitely want to do that. It should also 'nicely' merge with bswap for >> gathering the load side of a piecewise memory to memory copy. > > The code we produce now in .optimized is: > <bb 8> [15.35%]: > # DEBUG this => _42 > MEM[(struct inline_summary *)_42].estimated_self_stack_size = 0; > MEM[(struct inline_summary *)_42].self_size = 0; > _44 = &MEM[(struct inline_summary *)_42].self_time; > # DEBUG this => _44 > # DEBUG sig => 0 > # DEBUG exp => 0 > MEM[(struct sreal *)_42 + 16B].m_sig = 0; > MEM[(struct sreal *)_42 + 16B].m_exp = 0; > sreal::normalize (_44); > # DEBUG this => NULL > # DEBUG sig => NULL > # DEBUG exp => NULL > MEM[(struct inline_summary *)_42].min_size = 0; > MEM[(struct inline_summary *)_42].inlinable = 0; > MEM[(struct inline_summary *)_42].contains_cilk_spawn = 0; > MEM[(struct inline_summary *)_42].single_caller = 0; > MEM[(struct inline_summary *)_42].fp_expressions = 0; > MEM[(struct inline_summary *)_42].estimated_stack_size = 0; > MEM[(struct inline_summary *)_42].stack_frame_offset = 0;
It should handle at least the bitfields here (inlinable, contains_cilk_spawn, single_caller and fp_expression). Ah, I guess it doesn't because the padding isn't initialized and it doesn't want to touch it. Well, just a guess. inline_summary is also very badly laid out now with lots of padding. > _45 = &MEM[(struct inline_summary *)_42].time; > # DEBUG this => _45 > # DEBUG sig => 0 > # DEBUG exp => 0 > MEM[(struct sreal *)_42 + 56B].m_sig = 0; > MEM[(struct sreal *)_42 + 56B].m_exp = 0; > sreal::normalize (_45); > # DEBUG this => NULL > # DEBUG sig => NULL > # DEBUG exp => NULL > MEM[(struct inline_summary *)_42].size = 0; > MEM[(void *)_42 + 80B] = 0; > MEM[(void *)_42 + 88B] = 0; > MEM[(void *)_42 + 96B] = 0; > MEM[(struct predicate * *)_42 + 104B] = 0; > MEM[(void *)_42 + 112B] = 0; > MEM[(void *)_42 + 120B] = 0; > goto <bb 10>; [100.00%] > > so indeed it is not quite pretty at all. Even the bitfields are split > and there is offlined call to normlize zero in sreal. > I wonder why inliner does not see htis as obviously good inlining oppurtunity. > I will look into that problem. For store merging it would indeed be > very nice to improve this as well. > > Honza >> >> Richard. >> >> > >> > Jakub