Planned vectorization enhancements for 4.2:

1. Recognize reduction patterns (Dorit).
      Some computations have specialized target support and can be
vectorized more efficiently if the computation idiom is recognized and
vectorized as a whole. This is especially true of idioms that involve
multiple types: these require packing/unpacking of vector elements unless
the entire pattern is recognized. Examples of such patterns are summation
into a result wider than the arguments ("widening sum"), dot product, sum
of absolute differences, and more. This project will include (1) a pattern
recognition engine, to be used for patterns that the vectorizer can benefit
from; (2) functions to recognize reduction patterns; (3) an extension of
the current reduction support to handle reduction patterns; and (4) more
patterns that are not related to reduction, e.g. saturation.

* Delivery Date: Stage 1 of 4.2. Most of the above is already implemented,
and most of that is already in autovect-branch.
* Benefits: More loops vectorized.
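
For illustration, here are two scalar loops of the kind item 1 has in mind:
a widening sum and a dot product. The function names and the 16-bit/32-bit
element types are assumptions chosen for the example; they are not taken
from the vectorizer sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Widening sum: 16-bit elements accumulated into a 32-bit result. */
int32_t widening_sum(const int16_t *a, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];            /* each element is widened before the add */
    return sum;
}

/* Dot product: 16-bit multiplicands, 32-bit accumulator. */
int32_t dot_product(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}
```

Without pattern recognition, each loop mixes two element sizes and would
need explicit unpacking of the 16-bit vectors before the 32-bit add.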

2. Vectorize interleaved data (Ira).
      Currently the vectorizer supports only computations with stride 1
(consecutive data elements). Some important computations access data with
a stride other than 1; for example, complex data with the real and
imaginary parts interleaved has a stride of 2. We want to extend the
vectorizer to support these computations. For that we will also need to
introduce new tree-codes/optabs.

* Delivery Date: Stage 2 of 4.2.
* Benefits: More loops vectorized.
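
As a sketch of the stride-2 case in item 2, the loop below multiplies two
arrays of complex numbers stored with real and imaginary parts interleaved.
The function name and the use of float are assumptions made for the
example.

```c
#include <stddef.h>

/* Complex multiply over interleaved data: element i has its real part at
   index 2*i and its imaginary part at index 2*i + 1, so each of the two
   parts is accessed with stride 2. */
void complex_mul(float *out, const float *x, const float *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float xr = x[2 * i], xi = x[2 * i + 1];
        float yr = y[2 * i], yi = y[2 * i + 1];
        out[2 * i]     = xr * yr - xi * yi;   /* real part */
        out[2 * i + 1] = xr * yi + xi * yr;   /* imaginary part */
    }
}
```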

3. Vectorize in the presence of multiple data types (Dorit).
      Currently the vectorizer supports loops that operate on a single data
type. In particular, the vectorizer doesn't support type casts, which in
vectorized form require packing/unpacking of data elements between vectors.
We want to extend the vectorizer to handle type conversions. This will
require introducing some of the new tree-codes/optabs we discussed last
year.

* Delivery Date: Stage 2 of 4.2.
* Benefits: More loops vectorized.
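
To make item 3 concrete, here are two minimal conversion loops. The names
are hypothetical, and the 128-bit vector size mentioned in the comments is
an assumption for the sake of the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit to 32-bit conversion. With 128-bit vectors (assumed), one input
   vector of eight 16-bit elements produces two output vectors of four
   32-bit elements, so the vector form has to unpack. */
void widen(int32_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int32_t)src[i];
}

/* The reverse direction: two input vectors are packed into one output
   vector. Values are assumed to fit in 16 bits. */
void narrow(int16_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int16_t)src[i];
}
```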


Not sure when the rest of the items will be ready, or whether they'll make
it for 4.2, but they're high on our todo list:

4. Vectorization of induction (Dorit).
      The vectorizer currently doesn't support vectorization of induction,
e.g. a[i] = i. We want to extend the vectorizer to handle such
computations. We already have some of the required steps implemented as
part of the reduction support.

* Delivery Date: unknown
* Benefits: More loops vectorized.
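
A rough sketch of what vectorizing a[i] = i from item 4 involves, emulated
in plain C: keep a "vector" of induction values and advance every lane by
the vectorization factor on each iteration. VF = 4 and the function names
are assumptions for the example; this is not how the vectorizer itself is
written.

```c
#include <stddef.h>

#define VF 4  /* assumed vectorization factor */

/* Scalar form from the text. */
void fill_scalar(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
}

/* Hand-written emulation of the vector form. */
void fill_blocked(int *a, size_t n)
{
    int iv[VF];
    for (int l = 0; l < VF; l++)
        iv[l] = l;                    /* initial vector {0, 1, 2, 3} */

    size_t i = 0;
    for (; i + VF <= n; i += VF)
        for (int l = 0; l < VF; l++) {
            a[i + l] = iv[l];
            iv[l] += VF;              /* step vector {VF, VF, VF, VF} */
        }
    for (; i < n; i++)                /* scalar epilog for leftovers */
        a[i] = (int)i;
}
```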

5. Versioning for aliasing (Dorit/Ira).
      It is often difficult/impossible to prove that two data-references in
the loop don't overlap (e.g. when they are accessed using pointers). It is
still possible to vectorize such loops using runtime dependence checks,
much like the runtime alignment checks that were recently committed to
mainline. I.e., use loop versioning and guard the vectorized version with a
runtime aliasing test.

* Delivery Date: unknown
* Benefits: More loops vectorized.
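
The versioning idea in item 5 can be sketched by hand as follows. The
overlap test and the function names are assumptions for illustration; the
check the vectorizer would actually generate may look different.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the n-element ranges starting at p and q do not overlap. */
static int no_overlap(const int *p, const int *q, size_t n)
{
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    uintptr_t bytes = n * sizeof(int);
    return a + bytes <= b || b + bytes <= a;
}

void add_one(int *dst, const int *src, size_t n)
{
    if (no_overlap(dst, src, n)) {
        /* Version guarded by the runtime test: no dependence between the
           load and the store, so this copy of the loop can be vectorized. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1;
    } else {
        /* Fallback: the original scalar loop, element order preserved. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1;
    }
}
```

Both branches contain the same source loop; the point is that only the
guarded copy is known to be free of overlap.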

6. Cost model (Dorit/Ira).
      We are currently vectorizing whenever we can. This can often hurt
performance, for example, if the loop is very short, because of the
overheads involved in vectorization (e.g. alignment handling, loop peeling,
and epilog code for reduction). We also need the cost model to decide how
to vectorize; for example, there are different ways we can handle
alignment (versioning, peeling, misaligned vector accesses). We want to
estimate the costs involved in vectorization and use that estimate to
decide whether to vectorize, and how.

* Delivery Date: unknown
* Benefits: Improved performance when vectorizing.
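
Purely as an illustration of the kind of comparison item 6 describes, here
is a toy cost estimate. The structure, the function names, and all cost
units are hypothetical; none of them are values or interfaces from GCC.

```c
/* All cost figures are hypothetical units. */
struct loop_costs {
    unsigned scalar_iter;   /* cost of one scalar iteration */
    unsigned vector_iter;   /* cost of one vector iteration (vf elements) */
    unsigned overhead;      /* fixed cost: peeling, alignment handling,
                               reduction epilog */
    unsigned vf;            /* vectorization factor */
};

unsigned scalar_cost(const struct loop_costs *c, unsigned n)
{
    return n * c->scalar_iter;
}

unsigned vector_cost(const struct loop_costs *c, unsigned n)
{
    unsigned vec_iters = n / c->vf;
    unsigned leftover = n % c->vf;   /* handled by a scalar epilog loop */
    return c->overhead + vec_iters * c->vector_iter
           + leftover * c->scalar_iter;
}

/* Vectorize only if the estimate says it is cheaper. */
int profitable(const struct loop_costs *c, unsigned n)
{
    return vector_cost(c, n) < scalar_cost(c, n);
}
```

With these made-up numbers, a very short loop never recovers the fixed
overhead, which is the situation the text warns about.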

7. Misaligned stores (Dorit/Ira).
      We currently don't handle misaligned stores. Instead we peel the loop
to force the alignment of the store. This works only for one misaligned
store; if there's more than one misaligned store and we can't prove that
all the stores in the loop have the same misalignment, we can't vectorize
the loop. We want to add the capability to vectorize misaligned stores.

* Delivery Date: unknown
* Benefits: More loops vectorized.
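
A small sketch of why peeling alone cannot handle the case in item 7. It
assumes 16-byte vectors and 4-byte ints; the function names are made up for
the example.

```c
#include <stdint.h>

#define VEC_BYTES 16  /* assumed vector size in bytes */

/* Number of scalar iterations to peel so that an access starting at addr
   becomes VEC_BYTES-aligned. Assumes addr is a multiple of elem_size and
   elem_size divides VEC_BYTES. */
unsigned peel_count(uintptr_t addr, unsigned elem_size)
{
    unsigned misalign = (unsigned)(addr % VEC_BYTES);
    return misalign == 0 ? 0 : (VEC_BYTES - misalign) / elem_size;
}

/* Two stores with different misalignment. If a and b are both 16-byte
   aligned, the first store starts 4 bytes past a boundary and the second
   8 bytes past one, so they need 3 and 2 peeled iterations respectively. */
void two_stores(int *a, int *b, int n)
{
    for (int i = 0; i < n; i++) {
        a[i + 1] = i;
        b[i + 2] = i;
    }
}
```

No single peel count aligns both stores here, so at least one of them has
to be emitted as a misaligned vector store.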

Personnel

    * Dorit Nuzman
    * Ira Rosen

Dependencies

    None.

Modifications Required

    All modifications are local to the vectorizer pass, except for adding
new tree-codes and optabs for the new patterns and misaligned stores.

dorit
