On Sat, May 17, 2014 at 12:32 AM, Jeff Law <l...@redhat.com> wrote:
> On 05/16/14 04:07, Bin.Cheng wrote:
>
>> Yes, I think this one does have a good reason. The target independent
>> pass just makes sure that two consecutive memory access instructions
>> are free of data-dependency with each other, then feeds it to back-end
>> hook. It's back-end's responsibility to generate correct instruction.
>
> But given these two memory access insns, there's only a couple ways they're
> likely to combine into a single insn. We could just as easily have the
> target independent code construct a new insn then try to recognize it. If
> it's not recognized, then try the other way.
>
> Or is it the case that we're doing something beyond upsizing the mode?
>
>> It's not about modifying an existing insn then recognize it, it's
>> about creating new instruction sometimes. For example, we can
>> generate a simple move insn in Arm mode, while have to generate a
>> parallel instruction in Thumb mode. Target independent part has no
>> idea how to generate an expected insn. Moreover, back-end may check
>> some special conditions too.
>
> But can't you go through movXX to generate either the simple insn on the ARM
> or the PARALLEL on the thumb?
>

Yes, I think it's more than upsizing the mode. There is another
example, from one of the candidate peephole patches for x86, at
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00467.html
The patch wants to do the transformation below, which I think is very
target dependent.

+(define_peephole2
+  [(set (match_operand:DF 0 "register_operand")
+        (match_operand:DF 1 "memory_operand"))
+   (set (match_operand:V2DF 2 "register_operand")
+        (vec_concat:V2DF (match_dup 0)
+                         (match_operand:DF 3 "memory_operand")))]
+  "TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+   && REGNO (operands[0]) == REGNO (operands[2])
+   && adjacent_mem_locations (operands[1], operands[3])"
+  [(set (match_dup 2)
+        (unspec:V2DF [(match_dup 4)] UNSPEC_LOADU))]
+
+;; merge movsd/movhpd to movupd when TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+;; is true.
+(define_peephole2
+  [(set (match_operand:DF 0 "memory_operand")
+        (vec_select:DF (match_operand:V2DF 1 "register_operand")
+                       (parallel [(const_int 0)])))
+   (set (match_operand:DF 2 "memory_operand")
+        (vec_select:DF (match_dup 1)
+                       (parallel [(const_int 1)])))]
+  "TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+   && adjacent_mem_locations (operands[0], operands[2])"
+  [(set (match_dup 3)
+        (unspec:V2DF [(match_dup 1)] UNSPEC_STOREU))]

Thanks,
bin

--
Best Regards.