(resent because of mail issues on my end)
On Mon, 22 Oct 2018, Thomas Schwinge wrote:
I had a quick look at the difference, and a[j][i] remains in this form
throughout optimization. If I write instead *((*(a+j))+i) = 0; I get
j_10 = tmp_17 / 1025;
i_11 = tmp_17 % 1025;
_1 = (long unsi
On Mon, Oct 22, 2018 at 6:35 PM Thomas Schwinge wrote:
>
> Hi!
>
> Thanks for all your comments already! I continued looking into this for a
> bit (but then got interrupted by a higher-priority task). Regarding this
> one specifically:
>
> On Fri, 12 Oct 2018 21:14:11 +0200, Marc Glisse wrote:
>
Hi!
Thanks for all your comments already! I continued looking into this for a
bit (but then got interrupted by a higher-priority task). Regarding this
one specifically:
On Fri, 12 Oct 2018 21:14:11 +0200, Marc Glisse wrote:
> On Fri, 12 Oct 2018, Thomas Schwinge wrote:
>
> > Hmm, and without a
On Fri, Oct 12, 2018 at 2:14 PM Marc Glisse wrote:
> On Fri, 12 Oct 2018, Thomas Schwinge wrote:
>
> > Hmm, and without any OpenACC/OpenMP etc., actually the same problem is
> > also present when running the following code through the vectorizer:
> >
> > for (int tmp = 0; tmp < N_J * N_I; ++tm
On Mon, Oct 15, 2018 at 11:45 AM Jakub Jelinek wrote:
>
> On Mon, Oct 15, 2018 at 11:30:56AM +0200, Richard Biener wrote:
> > But isn't _actual_ collapsing an implementation detail?
>
> No, it is required by the standard and in many cases it is very much
> observable.
> #pragma omp parallel for sc
On Mon, Oct 15, 2018 at 11:30:56AM +0200, Richard Biener wrote:
> But isn't _actual_ collapsing an implementation detail?
No, it is required by the standard and in many cases it is very much
observable.
#pragma omp parallel for schedule(nonmonotonic: static, 23) collapse (2)
for (int i = 0; i < 64
On Mon, Oct 15, 2018 at 11:11 AM Jakub Jelinek wrote:
>
> On Mon, Oct 15, 2018 at 10:55:26AM +0200, Richard Biener wrote:
> > Yeah. Note this still makes the IVs not analyzable since i now effectively
> > becomes wrapping in the inner loop. For some special values we might
> > get away with a wr
On Mon, Oct 15, 2018 at 10:55:26AM +0200, Richard Biener wrote:
> Yeah. Note this still makes the IVs not analyzable since i now effectively
> becomes wrapping in the inner loop. For some special values we might
> get away with a wrapping CHREC in a bit-precision type but we cannot
> represent wr
On Fri, Oct 12, 2018 at 9:52 PM Jakub Jelinek wrote:
>
> On Fri, Oct 12, 2018 at 07:35:09PM +0200, Thomas Schwinge wrote:
> > int a[N_J][N_I];
> >
> > #pragma acc loop collapse(2)
> > for (int j = 0; j < N_J; ++j)
> > for (int i = 0; i < N_I; ++i)
> > a[j][i] = 0;
>
> For e
On Fri, Oct 12, 2018 at 07:35:09PM +0200, Thomas Schwinge wrote:
> int a[N_J][N_I];
>
> #pragma acc loop collapse(2)
> for (int j = 0; j < N_J; ++j)
> for (int i = 0; i < N_I; ++i)
> a[j][i] = 0;
For e.g.
int a[128][128];
void
foo (int m, int n)
{
#pragma omp for simd c
On Fri, 12 Oct 2018, Thomas Schwinge wrote:
Hmm, and without any OpenACC/OpenMP etc., actually the same problem is
also present when running the following code through the vectorizer:
for (int tmp = 0; tmp < N_J * N_I; ++tmp)
{
int j = tmp / N_I;
int i = tmp % N_I;
Hi!
I'm for the first time looking into the existing vectorization
functionality in GCC (yay!), and with that I'm also for the first time
encountering GCC's scalar evolution (scev) machinery (yay!), and the
chains of recurrences (chrec) used by that (yay!).
Obviously, I'm right now doing my own r
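
For readers following the thread, here is a minimal sketch of the two loop forms being compared: the nested loop, where j and i are plain induction variables, and the manually collapsed loop, where j and i are recovered from a single counter with a division and a modulo. The function names (fill_nested, fill_collapsed, collapsed_matches_nested) and the small array bounds are assumptions made up for this illustration; they are not taken from GCC or from the original test case. The sketch only shows that both forms touch the same elements, it says nothing about what the vectorizer or scev does with either.

```c
#include <string.h>

/* Illustrative bounds only; the thread's test case uses its own values. */
#define N_J 5
#define N_I 7

/* Nested form: j and i each advance by one per iteration of their loop. */
static void fill_nested(int a[N_J][N_I])
{
    for (int j = 0; j < N_J; ++j)
        for (int i = 0; i < N_I; ++i)
            a[j][i] = j * N_I + i;
}

/* Manually collapsed form: one counter, with j and i recomputed from it.
   Here i wraps back to 0 every N_I iterations of the single loop. */
static void fill_collapsed(int a[N_J][N_I])
{
    for (int tmp = 0; tmp < N_J * N_I; ++tmp) {
        int j = tmp / N_I;
        int i = tmp % N_I;
        a[j][i] = tmp;
    }
}

/* Returns 1 if both forms write identical contents, 0 otherwise. */
int collapsed_matches_nested(void)
{
    int x[N_J][N_I];
    int y[N_J][N_I];

    fill_nested(x);
    fill_collapsed(y);
    return memcmp(x, y, sizeof x) == 0;
}
```

Calling collapsed_matches_nested() from a small main is enough to check the equivalence. The point of contrast is the subscript computation: in the collapsed form the subscripts come from tmp / N_I and tmp % N_I rather than from simple increments.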