[Bug tree-optimization/63202] tree vectorizer does not make use of alignment information from VRP/CCP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2021-07-20
             Status|UNCONFIRMED                 |NEW

--- Comment #4 from Andrew Pinski ---
Confirmed.
Still happens on the trunk.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess the cast prevents this from being handled by maybe_set_nonzero_bits, though I guess it could be handled there.

That said, it is extremely fragile, because we insert the range and non-zero bits info on SSA_NAMEs and have this single exception for function parameters if they aren't used anywhere before the __builtin_unreachable check. As soon as e.g. the function is inlined, there might be more uses and the info can be lost. Richard didn't want to disable forward propagation if some SSA_NAME holds useful range info which the to-be-propagated SSA_NAME does not hold (in that case, we'd keep a new SSA_NAME with the more precise range/non-zero info around and be able to stick it somewhere).

The reason why we have __builtin_assume_aligned defined the way it is is that there is always an SSA_NAME to stick that info to, and it is clear in which part of the function the condition is true.
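To make the contrast in comment #1 concrete, here is a minimal sketch of the two ways of stating the alignment. The function names, the loop body and the use of uintptr_t for the cast are assumptions for illustration; the reporter's actual testcase is not quoted in this thread. Only __builtin_unreachable and __builtin_assume_aligned are the real GCC builtins.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reconstruction of the pattern under discussion: the
   alignment is implied by a check on the pointer value, cast to an
   integer type, followed by __builtin_unreachable.  */
void add_unreachable(int *p, const int *b, size_t n)
{
    if ((uintptr_t)p & 15)
        __builtin_unreachable();
    for (size_t i = 0; i < n; i++)
        p[i] += b[i];
}

/* The same loop using __builtin_assume_aligned.  The builtin returns a
   new pointer value, so there is a distinct name (pa) to which the
   alignment information can be attached, and it only applies to uses
   of pa after the call.  */
void add_assume_aligned(int *p, const int *b, size_t n)
{
    int *pa = __builtin_assume_aligned(p, 16);
    for (size_t i = 0; i < n; i++)
        pa[i] += b[i];
}
```

Both functions have undefined behaviour if p is not 16-byte aligned, so callers must guarantee the alignment. Compiling with something like gcc -O3 -S and comparing the two loops should show whether a given GCC version treats them differently.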
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, as with restrict it would be nice to be able to annotate the memory references themselves with alignment info.

Btw, a possibility would be to insert assume_aligned calls into the IL from the

  if (p & 15) __builtin_unreachable ();

pattern and remove the test and the __builtin_unreachable () call. Of course that is quite special and breaks down for assume (!(p & 15) && a == b).

As Jakub said, the testcase can be handled with the existing code as there is no use of p before the conditional.

Note that there isn't an extra loop for the unaligned case; the extra loop is for the case where there is aliasing between p and b. But yes, we fail to use aligned loads here (but movdqu doesn't have a penalty for that).
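As a rough illustration of the aliasing point in comment #2, the sketch below adds restrict to both parameters, which promises the compiler that p and b do not overlap. The function name and loop are hypothetical, not the reporter's testcase. With that promise the vectorizer generally has no need for a runtime overlap check or the scalar fallback loop that goes with it, though the exact code generated depends on the GCC version and target.

```c
#include <stddef.h>

/* Hypothetical variant of the loop: restrict states that p and b never
   alias, and __builtin_assume_aligned states that p is 16-byte
   aligned.  Nothing is claimed about the alignment of b.  */
void add_restrict(int *restrict p, const int *restrict b, size_t n)
{
    int *pa = __builtin_assume_aligned(p, 16);
    for (size_t i = 0; i < n; i++)
        pa[i] += b[i];
}
```

Calling this with overlapping arrays or a misaligned p is undefined behaviour. Building with gcc -O3 -fopt-info-vec reports whether the loop was vectorized and whether it was versioned for aliasing.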
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63202

--- Comment #3 from Andi Kleen <andi-gcc at firstfloor dot org> ---
I'm not sure rewriting the pattern to assume_aligned would be useful. After all, the user could already use assume_aligned directly. I was thinking more of cases where VRP/CCP can prove alignment in other ways from the code, and the vectorizer should use that.

Good point that the fallback is not for misalignment. I should probably use a fancier test case where misalignment matters for the cost model.

One interesting case is avoiding the need for tail code when the iteration count is not known to be a multiple of the vector length.
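A sketch of the tail-code case mentioned in comment #3, under the assumption of a 4-lane float vector (e.g. 128-bit SSE): the trip count is stated to be a multiple of 4 using the same __builtin_unreachable idiom. The function name and the factor are made up for illustration, and whether a particular GCC version actually drops the scalar epilogue from this information is exactly the kind of thing the bug is about, so no particular outcome is implied.

```c
#include <stddef.h>

/* Hypothetical example: tell the compiler that n is a multiple of 4,
   so a 4-lane vectorized loop would in principle need no scalar tail
   loop for leftover elements.  */
void scale_by_two(float *x, size_t n)
{
    if (n % 4 != 0)
        __builtin_unreachable();
    for (size_t i = 0; i < n; i++)
        x[i] *= 2.0f;
}
```

Calling this with n not divisible by 4 is undefined behaviour. Comparing the output of gcc -O3 -S with and without the check shows whether the epilogue loop is still emitted.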