> > > > Sure, I'm not questioning the patch, just wondering if we shouldn't
> > > > improve store-merging further (we want to do it anyway for e.g. bitop
> > > > adjacent operations etc.).
> >
> > We definitely want to do that.  It should also 'nicely' merge with bswap
> > for gathering the load side of a piecewise memory to memory copy.
The code we produce now in .optimized is:

  <bb 8> [15.35%]:
  # DEBUG this => _42
  MEM[(struct inline_summary *)_42].estimated_self_stack_size = 0;
  MEM[(struct inline_summary *)_42].self_size = 0;
  _44 = &MEM[(struct inline_summary *)_42].self_time;
  # DEBUG this => _44
  # DEBUG sig => 0
  # DEBUG exp => 0
  MEM[(struct sreal *)_42 + 16B].m_sig = 0;
  MEM[(struct sreal *)_42 + 16B].m_exp = 0;
  sreal::normalize (_44);
  # DEBUG this => NULL
  # DEBUG sig => NULL
  # DEBUG exp => NULL
  MEM[(struct inline_summary *)_42].min_size = 0;
  MEM[(struct inline_summary *)_42].inlinable = 0;
  MEM[(struct inline_summary *)_42].contains_cilk_spawn = 0;
  MEM[(struct inline_summary *)_42].single_caller = 0;
  MEM[(struct inline_summary *)_42].fp_expressions = 0;
  MEM[(struct inline_summary *)_42].estimated_stack_size = 0;
  MEM[(struct inline_summary *)_42].stack_frame_offset = 0;
  _45 = &MEM[(struct inline_summary *)_42].time;
  # DEBUG this => _45
  # DEBUG sig => 0
  # DEBUG exp => 0
  MEM[(struct sreal *)_42 + 56B].m_sig = 0;
  MEM[(struct sreal *)_42 + 56B].m_exp = 0;
  sreal::normalize (_45);
  # DEBUG this => NULL
  # DEBUG sig => NULL
  # DEBUG exp => NULL
  MEM[(struct inline_summary *)_42].size = 0;
  MEM[(void *)_42 + 80B] = 0;
  MEM[(void *)_42 + 88B] = 0;
  MEM[(void *)_42 + 96B] = 0;
  MEM[(struct predicate * *)_42 + 104B] = 0;
  MEM[(void *)_42 + 112B] = 0;
  MEM[(void *)_42 + 120B] = 0;
  goto <bb 10>; [100.00%]

so indeed it is not pretty at all.  Even the bitfields are split into
separate stores, and there is an out-of-line call to normalize zero in
sreal.  I wonder why the inliner does not see this as an obviously good
inlining opportunity; I will look into that problem.

For store merging it would indeed be very nice to improve this as well.

Honza
> >
> > Richard.
> >
> > > > > Jakub