Jakub Jelinek <ja...@redhat.com> wrote:
>On Thu, Aug 15, 2013 at 05:36:39PM +0400, Ilya Verbin wrote:
>> 2. The second question, regarding #pragma omp target implementation.
>> I'm going to reuse LTO approach in a prototype, that will produce 2
>> binaries - for host and target architectures. Target binary will contain
>> functions outlined from omp target region and some infrastructure to run
>> them.
>> To produce 2 binaries we need to run gcc and ld twice. At the first run
>> gcc will generate object file, that contains optimized code for host and
>> GIMPLE for target. At the second run gcc will read the GIMPLE and
>> generate optimized code for target.
>>
>> So, the question is - what is the right place for the second run of gcc
>> and ld? Should I insert them into liblto_plugin.so? Or should I create
>> entirely new plugin, that will only call gcc and ld for target, without
>> performing any LTO optimizations for host?
>> Suggestions?
>
>The rough plan (partly discussed at the accelerator BoF) was that we would
>stream LTO bytecode into special section somewhere during ompexp pass or so
>(note, right now LTO streaming streams everything in a TU, we'd want to
>stream only the routines with "omp declare target" attribute, and outlined
>#pragma omp target regions, and vars referenced from it and types etc.),
>then have some other linker plugin (-fopenmp/-fopenacc) that would recognize
>these special sections and run lto1 on those (if not -flto in some mode that
>would just compile each TU separately or something), then we need to link
>it together and let the linker put it into some section of the host binary
>or shared library.

Alternatively, you make lto-wrapper aware of this, which means that the WPA
stage would emit extra partitions that it marks for lto-wrapper. That sounds
better than another plugin to me.

Of course, WPA time might be too limiting. OTOH, the idea of multiple WPA
stages, a.k.a. iterating LTO, could be picked up to have a late WPA stage.

Richard.

> Jakub
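
To make concrete which pieces of a translation unit the plan above would stream for the target, here is a minimal C sketch. It only illustrates the two kinds of code Jakub mentions (a routine marked "omp declare target" and a #pragma omp target region that the compiler outlines); the names scale and scale_on_target are hypothetical and do not come from the thread, and nothing here shows how GCC actually implements the streaming.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. It sits inside a "declare target" block, so under
   the plan discussed in the thread it would be one of the routines whose
   intermediate representation is streamed for the target compiler. */
#pragma omp declare target
static int scale(int x, int factor)
{
    return x * factor;
}
#pragma omp end declare target

/* Hypothetical host function containing a target region. The body of the
   region is what gets outlined into a separate function; the array v is a
   variable referenced from the region, so it is mapped explicitly. */
void scale_on_target(int *v, int n, int factor)
{
    assert(v != NULL && n >= 0);

    #pragma omp target map(tofrom: v[0:n])
    {
        for (int i = 0; i < n; i++)
            v[i] = scale(v[i], factor);
    }
}
```

Everything else in the TU (for example, a main that only calls scale_on_target) would stay host-only. If no offload target is available, or the file is built without -fopenmp, the region simply runs on the host, so the sketch can be tried with any C compiler.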