Re: missed vectorization (was Some thoughts about steerring commitee work)

2007-06-18 Thread Dorit Nuzman
Tim Prince [EMAIL PROTECTED] wrote on 17/06/2007 19:47:10:

 [EMAIL PROTECTED] wrote:
  Tim Prince [EMAIL PROTECTED] wrote on 17/06/2007 04:15:56:
 
  [EMAIL PROTECTED] wrote:
  On Sat, Jun 16, 2007 at 06:54:46PM +0300, Dorit Nuzman wrote:
  There are quite a few known simple cases which the vectorizer fails to
  vectorize.
  by known you mean there are open missed-optimization PRs for them?
  (if
  Yes, that is what I meant.
 
  I'd be happy to file some PRs along this line, if there is interest.  C
 
  yes, there is
 
  or C++, if there's more interest in that than in Fortran.  But, gfortran
  fails to vectorize more than 50% of the stuff I run into every day,
  including most everything which involves distinct sections of the same
  array or COMMON block.
 
  I thought there was already a PR opened for this issue (probably by Toon),
  but I can't find it :-(
 
  thanks,
  dorit
 
 There are several issues.  EQUIVALENCE produces such a problem (PR32373)
 as do various kinds of references to multiple sections of the same array
 (PR32375,32376,32377,32378,32379,32380).  Only 2 of those PRs involve
 actual source/destination overlap, where the vectorizer would have to
 choose the correct direction (loop reversed or not).
 In the bigger case (PR32380) there are loops which vectorize in
 isolation but not in the presence of other loops.


thanks for taking the time to extract the testcases and open the PRs. I
guess the discussion can continue in bugzilla now...

 There are existing PRs on a somewhat similar issue involving type
 casting in C. IMHO, not vectorizing those might seem excusable.


I think we should teach the vectorizer to handle those as well (another
issue I've been wanting to get to for a while...)

thanks,
dorit

 Thanks,
 Tim



Re: missed vectorization (was Some thoughts about steerring commitee work)

2007-06-17 Thread Tim Prince

[EMAIL PROTECTED] wrote:

Tim Prince [EMAIL PROTECTED] wrote on 17/06/2007 04:15:56:


[EMAIL PROTECTED] wrote:

On Sat, Jun 16, 2007 at 06:54:46PM +0300, Dorit Nuzman wrote:

There are quite a few known simple cases which the vectorizer fails to
vectorize.

by known you mean there are open missed-optimization PRs for them?

(if

Yes, that is what I meant.


I'd be happy to file some PRs along this line, if there is interest.  C


yes, there is


or C++, if there's more interest in that than in Fortran.  But, gfortran
fails to vectorize more than 50% of the stuff I run into every day,
including most everything which involves distinct sections of the same
array or COMMON block.


I thought there was already a PR opened for this issue (probably by Toon),
but I can't find it :-(

thanks,
dorit

There are several issues.  EQUIVALENCE produces such a problem (PR32373) 
as do various kinds of references to multiple sections of the same array 
(PR32375,32376,32377,32378,32379,32380).  Only 2 of those PRs involve 
actual source/destination overlap, where the vectorizer would have to 
choose the correct direction (loop reversed or not).
In the bigger case (PR32380) there are loops which vectorize in 
isolation but not in the presence of other loops.
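
For readers who want a concrete picture of the "loop reversed or not" choice, here is a minimal C sketch. It is not taken from any of the PRs listed above; the helper name shift_section and its arguments are made up for illustration. When the source and destination sections of one array overlap, the element order must be chosen so that every element is read before it is overwritten, which is what a Fortran assignment such as a(2:n) = a(1:n-1) requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy n floats from a[src..src+n-1] to
   a[dst..dst+n-1] inside the same array, picking the loop direction
   that keeps the result equal to "read everything, then write". */
static void shift_section(float *a, size_t dst, size_t src, size_t n)
{
    assert(a != NULL);
    if (dst <= src) {
        /* Destination starts at or below the source: an ascending loop
           reads each element before a later iteration overwrites it. */
        for (size_t i = 0; i < n; i++)
            a[dst + i] = a[src + i];
    } else {
        /* Destination starts above the source: the loop must run in
           descending order, otherwise a[src] would be smeared forward. */
        for (size_t i = n; i-- > 0; )
            a[dst + i] = a[src + i];
    }
}
```

Either branch is a plain unit-stride copy once the direction is fixed, so in principle each one is a candidate for vectorization; the hard part for the compiler is proving which direction is safe.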


There are existing PRs on a somewhat similar issue involving type 
casting in C. IMHO, not vectorizing those might seem excusable.


Thanks,
Tim


Re: missed vectorization (was Some thoughts about steerring commitee work)

2007-06-17 Thread Janne Blomqvist

Tim Prince wrote:
There are several issues.  EQUIVALENCE produces such a problem (PR32373) 
as do various kinds of references to multiple sections of the same array 
(PR32375,32376,32377,32378,32379,32380).  Only 2 of those PRs involve 
actual source/destination overlap, where the vectorizer would have to 
choose the correct direction (loop reversed or not).
In the bigger case (PR32380) there are loops which vectorize in 
isolation but not in the presence of other loops.


Vectorization is tough work, and in the end if you succeed no one cares 
except for the crystallography weenies (and pipe stress freaks, if you 
catch my drift). ;-/


That being said, for the gfortran frontend there are a few things we can 
do to help the vectorizer:


1) Keep our data 16-byte aligned; this might help PR32380. For ALLOCATE 
we could use posix_memalign instead of malloc, if that is available. 
OTOH, AFAIK on x86-64 malloc returns 16-byte aligned memory, so perhaps 
it's not worth bothering about. I'm not sure how to teach the middle-end 
about alignment, but I'm sure there is some way.


2) Annotate variables and procedure interfaces to help the optimizers. I 
think about the only thing we do ATM is declaring pure procedures (incl. 
intrinsics if I read the spaghetti correctly) with DECL_IS_PURE. See 
PRs 31094, 31593, 20165, 32131.


3) Better analysis of array syntax. Basically recognizing certain 
patterns and reorganizing the loops so that they can be vectorized. This 
is hard work with limited applicability, and perhaps it's not really 
needed, provided we do (2) well (allowing the middle-end to reorder 
loops if needed)?
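
As a rough illustration of point (1), the following C sketch shows what a 16-byte aligned allocation via posix_memalign looks like. The wrapper name alloc_aligned16 is hypothetical and this is not the gfortran runtime's ALLOCATE code; it only assumes a POSIX system where posix_memalign is available.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate n floats on a 16-byte boundary.
   Returns NULL on failure; release the memory with free(). */
static float *alloc_aligned16(size_t n)
{
    void *p = NULL;
    /* posix_memalign returns 0 on success and stores the pointer in p. */
    if (posix_memalign(&p, 16, n * sizeof(float)) != 0)
        return NULL;
    assert((uintptr_t)p % 16 == 0);
    return p;
}
```

Allocating this way only guarantees the alignment at run time. The compiler still has to be told about it before the vectorizer can drop its alignment checks or peeling, which is the open question in (1).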


--
Janne Blomqvist


Re: missed vectorization (was Some thoughts about steerring commitee work)

2007-06-17 Thread Tim Prince

[EMAIL PROTECTED] wrote:

Tim Prince wrote:
There are several issues.  EQUIVALENCE produces such a problem 
(PR32373) as do various kinds of references to multiple sections of 
the same array (PR32375,32376,32377,32378,32379,32380).  Only 2 of 
those PRs involve actual source/destination overlap, where the 
vectorizer would have to choose the correct direction (loop reversed 
or not).
In the bigger case (PR32380) there are loops which vectorize in 
isolation but not in the presence of other loops.


Vectorization is tough work, and in the end if you succeed no one cares 
except for the crystallography weenies (and pipe stress freaks, if you 
catch my drift). ;-/


That being said, for the gfortran frontend there are a few things we can 
do to help the vectorizer:


1) Keep our data 16-byte aligned; this might help PR32380.

In the actual application from which PR32380 is derived, the initial 
index is 1 (16-byte aligned) 99% of the time.  Some of those loops do 
vectorize with gfortran when taken in isolation.  As all the arrays are 
set at the same addresses, modulo 32 bytes, any required alignment 
adjustments for the rare cases are the same for all.

Recent public statements indicated that applications like these account 
for 25% of the server business of the largest vendors (more than that 
for their competitors), and this fraction is growing.  While this may 
not fit those who use gcc compilers, once the effort has been made to 
support vectorization, it may be of interest to see whether the 
boundaries could be extended.
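
To make the alignment remark in this message concrete, here is a small C sketch of the usual peeling arithmetic. It is an illustration under stated assumptions (4-byte floats, 16-byte vectors, a hypothetical helper named peel_count), not a description of what the GCC vectorizer actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of leading scalar iterations needed
   before p reaches a 16-byte boundary.  Assumes sizeof(float) == 4
   and that p is at least float-aligned. */
static size_t peel_count(const float *p)
{
    size_t misalign = (size_t)((uintptr_t)p % 16);
    return ((16 - misalign) % 16) / sizeof(float);
}
```

If every array starts at the same address modulo 32 bytes (and therefore modulo 16), then for any given starting index peel_count returns the same value for all of them, so a single peeled prologue can align every reference in the loop at once.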