Hi,
In loop vectorization, I found that the vectorizer insists on loop peeling even
though our target supports misaligned memory access. This results in much bigger
code size for a very simple loop. I defined
TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT and also
TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST to make a misaligned access almost
as cheap as an aligned one, but the vectorizer still peels anyway.
In the vect_enhance_data_refs_alignment function, it seems that the result of
vect_supportable_dr_alignment is not used in deciding whether to do peeling:
  supportable_dr_alignment = vect_supportable_dr_alignment (dr, true);
  do_peeling = vector_alignment_reachable_p (dr);
Later on, there is code that compares load and store costs. But it only decides
whether to peel for the load or for the store, not whether to peel at all.
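To make the point concrete, here is a toy model of the decision as I read it
versus the decision I expected. It is only a sketch: the names toy_costs,
toy_peel_current and toy_peel_expected, and the peel_overhead figure, are made
up for illustration. This is neither GCC code nor my workaround patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not GCC types or functions. */
struct toy_costs {
    int aligned_cost;    /* cost of one aligned vector access */
    int unaligned_cost;  /* cost of one misaligned vector access */
    int peel_overhead;   /* assumed extra cost of the peeled prologue */
};

/* Behaviour as I read it: peel whenever alignment is reachable. */
bool toy_peel_current(bool alignment_reachable)
{
    return alignment_reachable;
}

/* Behaviour I expected: if misaligned access is supported, peel only when
   the saving over all accesses outweighs the prologue overhead. */
bool toy_peel_expected(bool alignment_reachable, bool misaligned_supported,
                       struct toy_costs c, int n_accesses)
{
    assert(n_accesses >= 0);
    if (!alignment_reachable)
        return false;   /* peeling cannot help */
    if (!misaligned_supported)
        return true;    /* alignment must be established somehow */
    return (c.unaligned_cost - c.aligned_cost) * n_accesses > c.peel_overhead;
}
```

With equal aligned and unaligned costs, toy_peel_expected never peels when
misaligned access is supported, while toy_peel_current still does.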
Currently I have a workaround. For the following simple loop, the code size is
80 bytes with the patch vs. 352 bytes without it (-O2 -ftree-vectorize,
gcc 4.8.3 20131114):
int A[100];
int B[100];
void foo2() {
  int i;
  for (i = 0; i < 100; ++i)
    A[i] = B[i] + 100;
}
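For reference, here is a hand-written sketch of where the extra size comes
from. It mimics the general shape of a loop peeled for alignment (scalar
prologue, vector body, scalar epilogue) next to the shape without peeling. It is
not actual compiler output: the vectorization factor of 4, the 16-byte
alignment, and the names foo2_peeled and foo2_unpeeled are assumptions, and the
inner j loop merely stands in for one vector operation.

```c
#include <assert.h>
#include <stdint.h>

#define N 100
#define VF 4          /* assumed vectorization factor: 4 ints per vector */
#define VEC_ALIGN 16  /* assumed vector alignment in bytes */

int A[N];
int B[N];

/* Shape with peeling: prologue + vector body + epilogue. */
void foo2_peeled(void)
{
    int i = 0;

    /* Prologue: scalar iterations until &A[i] is VEC_ALIGN-aligned. */
    while (i < N && ((uintptr_t)&A[i] % VEC_ALIGN) != 0) {
        A[i] = B[i] + 100;
        ++i;
    }

    /* Vector body: each trip stands for one vector iteration. */
    for (; i + VF <= N; i += VF)
        for (int j = 0; j < VF; ++j)
            A[i + j] = B[i + j] + 100;

    /* Epilogue: scalar iterations for the leftover elements. */
    for (; i < N; ++i)
        A[i] = B[i] + 100;
}

/* Shape without peeling: the trip count is known and a multiple of VF,
   so only the vector body (using misaligned accesses) is needed. */
void foo2_unpeeled(void)
{
    assert(N % VF == 0);
    for (int i = 0; i < N; i += VF)
        for (int j = 0; j < VF; ++j)
            A[i + j] = B[i + j] + 100;
}
```

Both functions compute the same result; the peeled form simply carries two
extra loops plus the alignment check.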
What is the best way to tell the vectorizer not to peel in such a situation?
Thanks,
Bingfeng Mei
Broadcom UK