Hi Jakub et al.,

        Did you get a chance to look at this _Cilk_for patch? 

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Iyer, Balaji V
> Sent: Friday, January 24, 2014 3:34 PM
> To: Jakub Jelinek
> Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'r...@redhat.com'
> Subject: RE: [PATCH] _Cilk_for for C and C++
> 
> 
> 
> > -----Original Message-----
> > From: Jakub Jelinek [mailto:ja...@redhat.com]
> > Sent: Friday, January 24, 2014 2:42 PM
> > To: Iyer, Balaji V
> > Cc: Jason Merrill; 'Jeff Law'; 'Aldy Hernandez';
> > 'gcc-patches@gcc.gnu.org'; 'r...@redhat.com'
> > Subject: Re: [PATCH] _Cilk_for for C and C++
> >
> > On Thu, Jan 23, 2014 at 04:38:53PM +0000, Iyer, Balaji V wrote:
> > >   This is how I started to think of it at first, but then when I
> > > thought about it ... in _Cilk_for, unlike the #pragma simd's for, the
> > > for statement - not the body - (e.g. "_Cilk_for (int ii = 0; ii < 10;
> > > ii++)") doesn't really do anything, nor does it belong in the child
> > > function. It is really mostly used to calculate the loop count and
> > > capture the step-size and starting point.
> > >
> > >   The child function has its own loop that will have a step size of 1
> > > regardless of your step size. You use the step-size to find the
> > > correct spot. Let me give you an example:
> > >
> > > _Cilk_for (int ii = 0; ii < 10; ii = ii  + 2) {
> > >   Array [ii] = 5;
> > > }
> > >
> > > This is translated to the following (assume grain is something that
> > > the user input):
> > >
> > > data_ptr.start = 0;
> > > data_ptr.end = 10;
> > > data_ptr.step_size = 2;
> > > __cilkrts_cilk_for_32 (child_function, &data_ptr, (10-0)/2, grain);
> > >
> > > Child_function (void *data_ptr, int low, int high) {
> > >   for (xx = low; xx < high; xx++)
> > >    {
> > >           Tmp_var = (xx * data_ptr->step_size) + data_ptr->start;
> > >           // Note: if the _Cilk_for was (ii = 9; ii >= 0; ii -= 2), we
> > >           // would have something like this:
> > >           // Tmp_var = data_ptr->end - (xx * data_ptr->step_size)
> > >           // The for-loop above won't change.
> > >           Array[Tmp_var] = 5;
> > >   }
> > > }
> >
> > This isn't really much different from
> > #pragma omp parallel for schedule(runtime, N) (i.e. the combined
> > construct), when it is combined, we also don't emit a call to
> > GOMP_parallel but to some other function to which we pass the number
> > of iterations and chunk size (== grain in Cilk+ terminology), the only
> > (minor) difference is that for OpenMP when you handle the whole low ...
> > high range the child function doesn't exit, but calls a function to
> > give it the next pair of low/high and only when that function tells it
> > there is no further work to do, it returns.  But, the Cilk+ case is
> > clearly the same thing with just implicit telling there is no further
> > work in the current function.
> >
> > So, I'd strongly prefer if you swap the parallel with Cilk_for, just
> > set the flag that the two are combined like OpenMP already has for
> > tons of constructs, and during expansion you just treat it together.
> 
> Hi Jakub,
>       What you are suggesting here would require a significant rewrite of
> the code. This version of _Cilk_for works and it does share a significant
> amount of work with OMP routines as requested by other GCC developers.
> Given the time constraints, let's try to get this version accepted so that
> the feature will be available for the users, and we will look into moving
> toward your suggestion when phase 1 opens again.
> 
> Thanks,
> 
> Balaji V. Iyer.
> 
> 
> >
> >     Jakub
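
For readers following the thread, the lowering discussed above can be
sketched in plain C. This is only a serial illustration under stated
assumptions, not the code from the patch: fake_cilk_for is a hypothetical
stand-in for the runtime entry point (it is not __cilkrts_cilk_for_32),
struct loop_data, struct chunk_queue, next_chunk and trip_count are names
invented here, and the rounded-up trip count is an assumption (the example
in the thread simply uses (10-0)/2). The first child handles one low/high
range and returns, as described for Cilk+; the second keeps asking for the
next low/high pair until none is left, as described for the combined
OpenMP construct.

```c
#include <stddef.h>

/* Hypothetical captured loop state (start, bound, step) plus the array
   that the loop body writes to. */
struct loop_data {
    int start;
    int end;
    int step_size;
    int *array;
};

/* Child function: always walks [low, high) with a step of 1 and maps the
   normalized index xx back to the original induction value. */
static void child_function(void *data, int low, int high)
{
    struct loop_data *d = data;
    for (int xx = low; xx < high; xx++) {
        int ii = xx * d->step_size + d->start;
        d->array[ii] = 5;
    }
}

/* Serial stand-in for the runtime (assumption, not the real Cilk API):
   hands the child one chunk of at most `grain` iterations at a time. */
static void fake_cilk_for(void (*body)(void *, int, int), void *data,
                          int count, int grain)
{
    for (int low = 0; low < count; low += grain) {
        int high = (low + grain < count) ? low + grain : count;
        body(data, low, high);
    }
}

/* Hypothetical chunk dispenser for the OpenMP-style variant. */
struct chunk_queue {
    int next;
    int count;
    int grain;
};

/* Returns 1 and fills low/high while work remains, 0 when it is done. */
static int next_chunk(struct chunk_queue *q, int *low, int *high)
{
    if (q->next >= q->count)
        return 0;
    *low = q->next;
    *high = (q->next + q->grain < q->count) ? q->next + q->grain : q->count;
    q->next = *high;
    return 1;
}

/* OpenMP-style child: does not return after one range, but keeps
   fetching low/high pairs until the dispenser reports no further work. */
static void omp_style_child(struct loop_data *d, struct chunk_queue *q)
{
    int low, high;
    while (next_chunk(q, &low, &high))
        child_function(d, low, high);
}

/* Trip count of "ii = start; ii < end; ii += step", rounded up
   (assumption; valid for ascending loops with step > 0). */
static int trip_count(int start, int end, int step)
{
    return (end > start) ? (end - start + step - 1) / step : 0;
}
```

Calling fake_cilk_for (child_function, &d, trip_count (0, 10, 2), grain)
with d = { 0, 10, 2, array } touches the same elements as the original
_Cilk_for example, and so does omp_style_child with a queue covering the
same count; only the place where the "no more work" decision is made
differs.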
