Re: Parallelize the compilation using Threads
On Tue, 2019-02-12 at 15:12 +0100, Richard Biener wrote:
> On Mon, Feb 11, 2019 at 10:46 PM Giuliano Belinassi wrote:
> >
> > Hi,
> >
> > I was just wondering what API should I use to spawn threads and control
> > its flow. Should I use OpenMP, pthreads, or something else?
> >
> > My point what if we break compatibility with something. If we use
> > OpenMP, I'm afraid that we will break compatibility with compilers not
> > supporting it. On the other hand, If we use pthread, we will break
> > compatibility with non-POSIX systems (Windows).
>
> I'm not sure we have a thread abstraction for the host - we do have
> one for the target via libgcc gthr.h though. For prototyping I'd resort
> to this same interface and fixup the host != target case as needed.

Or maybe, in the year 2019, we could assume that most c++ compilers
which are used to compile GCC support c++11 and come with an adequate
implementation... yeah, I know, sounds jacked :)

Cheers,
Oleg
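A minimal sketch of the C++11 option raised in this message, assuming the host compiler ships a working <thread>. The helper name run_in_parallel and the striding scheme are hypothetical and not taken from GCC; the point is only that std::thread needs neither OpenMP support nor direct pthread calls in the source.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not a GCC API): run work(i) for every i in
// [0, n_items) on n_threads std::thread workers, then wait for all of them.
inline void run_in_parallel(std::size_t n_items, unsigned n_threads,
                            const std::function<void(std::size_t)> &work)
{
  assert(n_threads > 0);
  std::vector<std::thread> workers;
  workers.reserve(n_threads);
  for (unsigned t = 0; t < n_threads; ++t)
    workers.emplace_back([=, &work] {
      // Static striding: worker t handles items t, t + n_threads, ...
      for (std::size_t i = t; i < n_items; i += n_threads)
        work(i);
    });
  // Block until every worker has finished.
  for (std::thread &w : workers)
    w.join();
}
```

The same source should build on POSIX and Windows hosts wherever the standard library implements std::thread; some toolchains may still need a flag such as -pthread when linking.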
Re: Parallelize the compilation using Threads
On Mon, Feb 11, 2019 at 10:46 PM Giuliano Belinassi wrote:
>
> Hi,
>
> I was just wondering what API should I use to spawn threads and control
> its flow. Should I use OpenMP, pthreads, or something else?
>
> My point what if we break compatibility with something. If we use
> OpenMP, I'm afraid that we will break compatibility with compilers not
> supporting it. On the other hand, If we use pthread, we will break
> compatibility with non-POSIX systems (Windows).

I'm not sure we have a thread abstraction for the host - we do have
one for the target via libgcc gthr.h though. For prototyping I'd resort
to this same interface and fixup the host != target case as needed.

Richard.

>
> Giuliano.
Re: Parallelize the compilation using Threads
Hi,

I was just wondering which API I should use to spawn threads and control
their flow. Should I use OpenMP, pthreads, or something else?

My concern is whether we would break compatibility with something. If we
use OpenMP, I'm afraid that we will break compatibility with compilers
that do not support it. On the other hand, if we use pthreads, we will
break compatibility with non-POSIX systems (Windows).

Giuliano.
Re: Parallelize the compilation using Threads
Hi,

Since gimple-match.c takes so long to compile, I was wondering if it
might be possible to reorder the build so that its compilation starts
early in the dependency graph. I've attached a graphic showing what I
mean, along with the methodology, to PR84402
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402).

Maybe there is a simple change that can be made to the Makefile? Or
maybe a feature could be added to Make itself to record the elapsed time
for each file and compute a better schedule for the next build?

Giuliano.

On 11/19, Richard Biener wrote:
> On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi wrote:
> >
> > Hi! Sorry for the late reply again :P
> >
> > On Thu, Nov 15, 2018 at 8:29 AM Richard Biener wrote:
> > >
> > > On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi wrote:
> > > >
> > > > As a brief introduction, I am a graduate student that got interested
> > > > in the "Parallelize the compilation using threads"(GSoC 2018 [1]). I
> > > > am a newcommer in GCC, but already have sent some patches, some of
> > > > them have already been accepted [2].
> > > >
> > > > I brought this subject up in IRC, but maybe here is a proper place to
> > > > discuss this topic.
> > > >
> > > > From my point of view, parallelizing GCC itself will only speed up the
> > > > compilation of projects which have a big file that creates a
> > > > bottleneck in the whole project compilation (note: by big, I mean the
> > > > amount of code to generate).
> > >
> > > That's true. During GCC bootstrap there are some of those (see PR84402).
> > >
> > > One way to improve parallelism is to use link-time optimization where
> > > even single source files can be split up into multiple link-time units.
> > > But then there's the serial whole-program analysis part.
> >
> > Did you mean this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 ?
> > That is a lot of data :-)
> >
> > It seems that 'phase opt and generate' is the most time-consuming
> > part. Is that the 'GIMPLE optimization pipeline' you were talking
> > about in this thread:
> > https://gcc.gnu.org/ml/gcc/2018-03/msg00202.html
>
> It's everything that comes after the frontend parsing bits, thus this
> includes in particular RTL optimization and early GIMPLE optimizations.
>
> > > > Additionally, I know that GCC must not
> > > > change the project layout, but from the software engineering
> > > > perspective,
> > > > this may be a bad smell that indicates that the file should be broken
> > > > into smaller files. Finally, the Makefiles will take care of the
> > > > parallelization task.
> > >
> > > What do you mean by GCC must not change the project layout? GCC
> > > happily re-orders functions and link-time optimization will reorder
> > > TUs (well, linking may as well).
> > >
> >
> > That was a response to a comment made on IRC:
> >
> > On Thu, Nov 15, 2018 at 9:44 AM Jonathan Wakely wrote:
> > >I think this is in response to a comment I made on IRC. Giuliano said
> > >that if a project has a very large file that dominates the total build
> > >time, the file should be split up into smaller pieces. I said "GCC
> > >can't restructure people's code. it can only try to compile it
> > >faster". We weren't referring to code transformations in the compiler
> > >like re-ordering functions, but physically refactoring the source
> > >code.
> >
> > Yes. But from one of the attachments from PR84402, it seems that such
> > files exist on GCC,
> > https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> >
> > > > My questions are:
> > > >
> > > > 1. Is there any project compilation that will significantly be improved
> > > > if GCC runs in parallel? Do someone has data about something related
> > > > to that? How about the Linux Kernel? If not, I can try to bring some.
> > >
> > > We do not have any data about this apart from experiments with
> > > splitting up source files for PR84402.
> > >
> > > > 2. Did I correctly understand the goal of the parallelization? Can
> > > > anyone provide extra details to me?
> > >
> > > You may want to search the mailing list archives since we had a
> > > student
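The scheduling idea at the top of this message (start the slowest file early) can be illustrated with a small simulation. This is a sketch under simplifying assumptions: it ignores dependencies between files, the function names are hypothetical, and any durations passed in are made-up numbers rather than measurements from PR84402. It compares starting jobs in a given order with starting the longest job first.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical sketch: simulate a greedy scheduler with n_jobs parallel
// slots. Jobs start in the given order, each on the slot that frees up
// first. Returns the makespan in the same unit as the durations.
inline double simulate_makespan(const std::vector<double> &durations,
                                unsigned n_jobs)
{
  assert(n_jobs > 0);
  // Min-heap holding the time at which each slot becomes free.
  std::priority_queue<double, std::vector<double>, std::greater<double>> free_at;
  for (unsigned i = 0; i < n_jobs; ++i)
    free_at.push(0.0);
  double makespan = 0.0;
  for (double d : durations) {
    const double start = free_at.top();
    free_at.pop();
    free_at.push(start + d);
    makespan = std::max(makespan, start + d);
  }
  return makespan;
}

// Longest-first ordering based on times recorded in a previous build.
inline std::vector<double> longest_first(std::vector<double> durations)
{
  std::sort(durations.begin(), durations.end(), std::greater<double>());
  return durations;
}
```

With four 1-second jobs followed by one 30-second job on two slots, the given order finishes at 32 seconds, while the longest-first order finishes at 30 seconds because the long job no longer waits behind the short ones.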
Re: Parallelize the compilation using Threads
: 98.82 4.83 104.24 1886632 kB
> Extra diagnostic checks enabled; compiler may run slowly.
> Configure with --enable-checking=release to disable checks.
>
> real    1m54.934s
> user    1m48.938s
> sys     0m5.196s
>
>
> Thank you
> Giuliano.
>
> On 01/14, Richard Biener wrote:
> > On Mon, Jan 14, 2019 at 12:41 PM Giuliano Belinassi wrote:
> > >
> > > Hi,
> > >
> > > I am currently studying the GIMPLE IR documentation and thinking about a
> > > way easily gather the timing information. I was thinking about about
> > > adding this feature to gcc to show/dump the elapsed time on GIMPLE. Does
> > > this makes sense? Is this already implemented somewhere? Where is a good
> > > way to start it?
> >
> > There's -ftime-report which more-or-less tells you the time spent in the
> > individual passes. I think there's no overall group to count GIMPLE
> > optimizers vs. RTL optimizers though.
> >
> > > Richard Biener: I would like to know What is your nickname in IRC :)
> >
> > It's richi.
> >
> > Richard.
> >
> > > Thank you,
> > > Giuliano.
> > >
> > > On 12/17, Richard Biener wrote:
> > > > On Wed, Dec 12, 2018 at 4:46 PM Giuliano Augusto Faulin Belinassi wrote:
> > > > >
> > > > > Hi, I have some news. :-)
> > > > >
> > > > > I replicated the Martin Liška experiment [1] on a 64-cores machine for
> > > > > gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> > > > > and I am excited to dive into this problem. As a result, I want to
> > > > > propose GSoC project on this issue, starting with something like:
> > > > > 1- Systematically create a benchmark for easily information
> > > > > gathering. Martin Liška already made the first version of it, but I
> > > > > need to improve it.
> > > > > 2- Find and document the global states (Try to reduce the gcc's
> > > > > global states as well).
> > > > > 3- Define the parallelization strategy.
> > > > > 4- First parallelization attempt.
Re: Parallelize the compilation using Threads
On Mon, Jan 14, 2019 at 12:41 PM Giuliano Belinassi wrote:
>
> Hi,
>
> I am currently studying the GIMPLE IR documentation and thinking about a
> way easily gather the timing information. I was thinking about about
> adding this feature to gcc to show/dump the elapsed time on GIMPLE. Does
> this makes sense? Is this already implemented somewhere? Where is a good
> way to start it?

There's -ftime-report which more-or-less tells you the time spent in the
individual passes. I think there's no overall group to count GIMPLE
optimizers vs. RTL optimizers though.

> Richard Biener: I would like to know What is your nickname in IRC :)

It's richi.

Richard.

> Thank you,
> Giuliano.
>
> On 12/17, Richard Biener wrote:
> > On Wed, Dec 12, 2018 at 4:46 PM Giuliano Augusto Faulin Belinassi wrote:
> > >
> > > Hi, I have some news. :-)
> > >
> > > I replicated the Martin Liška experiment [1] on a 64-cores machine for
> > > gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> > > and I am excited to dive into this problem. As a result, I want to
> > > propose GSoC project on this issue, starting with something like:
> > > 1- Systematically create a benchmark for easily information
> > > gathering. Martin Liška already made the first version of it, but I
> > > need to improve it.
> > > 2- Find and document the global states (Try to reduce the gcc's
> > > global states as well).
> > > 3- Define the parallelization strategy.
> > > 4- First parallelization attempt.
> > >
> > > I also proposed this issue as a research project to my advisor and he
> > > supported me on this idea. So I can work for at least one year on
> > > this, and other things related to it.
> > >
> > > Would anyone be willing to mentor me on this?
> >
> > As the one who initially suggested the project I'm certainly willing
> > to mentor you on this.
> >
> > Richard.
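Since -ftime-report lists passes individually, one way to get the GIMPLE-versus-RTL split asked about in this message is to sum the per-pass figures afterwards. The sketch below assumes the rows have already been extracted from the report and labelled by hand; the PassTime record, the group labels, and total_by_group are hypothetical and do not correspond to anything in GCC's own timing code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one timed pass. The group label is supplied by
// the caller; it is not something -ftime-report prints.
struct PassTime {
  std::string pass;   // pass name as it appears in the report
  std::string group;  // caller-chosen label, e.g. "GIMPLE" or "RTL"
  double seconds;     // time attributed to this pass
};

// Sum the per-pass times into one total per group label.
inline std::map<std::string, double>
total_by_group(const std::vector<PassTime> &rows)
{
  std::map<std::string, double> totals;
  for (const PassTime &r : rows) {
    assert(r.seconds >= 0.0);
    totals[r.group] += r.seconds;
  }
  return totals;
}
```

Keeping the classification outside the function means the grouping can be refined without touching the summation.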
Re: Parallelize the compilation using Threads
Hi,

I am currently studying the GIMPLE IR documentation and thinking about a
way to easily gather timing information. I was thinking about adding a
feature to GCC to show/dump the elapsed time spent on GIMPLE. Does this
make sense? Is this already implemented somewhere? Where would be a good
place to start?

Richard Biener: I would like to know what your nickname on IRC is :)

Thank you,
Giuliano.

On 12/17, Richard Biener wrote:
> On Wed, Dec 12, 2018 at 4:46 PM Giuliano Augusto Faulin Belinassi wrote:
> >
> > Hi, I have some news. :-)
> >
> > I replicated the Martin Liška experiment [1] on a 64-cores machine for
> > gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> > and I am excited to dive into this problem. As a result, I want to
> > propose GSoC project on this issue, starting with something like:
> > 1- Systematically create a benchmark for easily information
> > gathering. Martin Liška already made the first version of it, but I
> > need to improve it.
> > 2- Find and document the global states (Try to reduce the gcc's
> > global states as well).
> > 3- Define the parallelization strategy.
> > 4- First parallelization attempt.
> >
> > I also proposed this issue as a research project to my advisor and he
> > supported me on this idea. So I can work for at least one year on
> > this, and other things related to it.
> >
> > Would anyone be willing to mentor me on this?
>
> As the one who initially suggested the project I'm certainly willing
> to mentor you on this.
>
> Richard.
>
> > [1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> > [2] https://www.ime.usp.br/~belinass/64cores-experiment.svg
> > [3] https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg
> > On Mon, Nov 19, 2018 at 8:53 AM Richard Biener wrote:
> > >
> > > On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi wrote:
> > > >
> > > > Hi!
Re: Parallelize the compilation using Threads
On Wed, Dec 12, 2018 at 4:46 PM Giuliano Augusto Faulin Belinassi wrote:
>
> Hi, I have some news. :-)
>
> I replicated the Martin Liška experiment [1] on a 64-cores machine for
> gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> and I am excited to dive into this problem. As a result, I want to
> propose GSoC project on this issue, starting with something like:
> 1- Systematically create a benchmark for easily information
> gathering. Martin Liška already made the first version of it, but I
> need to improve it.
> 2- Find and document the global states (Try to reduce the gcc's
> global states as well).
> 3- Define the parallelization strategy.
> 4- First parallelization attempt.
>
> I also proposed this issue as a research project to my advisor and he
> supported me on this idea. So I can work for at least one year on
> this, and other things related to it.
>
> Would anyone be willing to mentor me on this?

As the one who initially suggested the project I'm certainly willing
to mentor you on this.

Richard.

> [1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> [2] https://www.ime.usp.br/~belinass/64cores-experiment.svg
> [3] https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg
> On Mon, Nov 19, 2018 at 8:53 AM Richard Biener wrote:
> >
> > On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi wrote:
> > >
> > > Hi! Sorry for the late reply again :P
> > >
> > > On Thu, Nov 15, 2018 at 8:29 AM Richard Biener wrote:
> > > >
> > > > On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi wrote:
> > > > >
> > > > > As a brief introduction, I am a graduate student that got interested
> > > > > in the "Parallelize the compilation using threads"(GSoC 2018 [1]). I
> > > > > am a newcommer in GCC, but already have sent some patches, some of
> > > > > them have already been accepted [2].
Re: Parallelize the compilation using Threads
Hi,

See comments inline.

On 12/13, Bin.Cheng wrote:
> On Wed, Dec 12, 2018 at 11:46 PM Giuliano Augusto Faulin Belinassi wrote:
> >
> > Hi, I have some news. :-)
> >
> > I replicated the Martin Liška experiment [1] on a 64-cores machine for
> > gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> > and I am excited to dive into this problem. As a result, I want to
> > propose GSoC project on this issue, starting with something like:
> > 1- Systematically create a benchmark for easily information
> > gathering. Martin Liška already made the first version of it, but I
> > need to improve it.
> > 2- Find and document the global states (Try to reduce the gcc's
> > global states as well).
> > 3- Define the parallelization strategy.
> > 4- First parallelization attempt.
> Hi Giuliano,
>
> Thanks very much for working on this. It could be very useful, for
> example, one bottleneck we have is slow compilation of big single
> source file after intensively using distribution compilation. Of
> course, a good parallelization strategy is needed.
>

Interesting. How many lines does the generated file have? Does it use
C++ templates? The generated gimple-match.c file, for example, has 98786
lines and takes about 30s to compile.

> Thanks,
> bin
> >
> > I also proposed this issue as a research project to my advisor and he
> > supported me on this idea. So I can work for at least one year on
> > this, and other things related to it.
> >
> > Would anyone be willing to mentor me on this?
> >
> > [1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> > [2] https://www.ime.usp.br/~belinass/64cores-experiment.svg
> > [3] https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg
> > On Mon, Nov 19, 2018 at 8:53 AM Richard Biener wrote:
> > >
> > > On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi wrote:
> > > >
> > > > Hi!
Re: Parallelize the compilation using Threads
On Wed, Dec 12, 2018 at 11:46 PM Giuliano Augusto Faulin Belinassi wrote:
>
> Hi, I have some news. :-)
>
> I replicated the Martin Liška experiment [1] on a 64-cores machine for
> gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> and I am excited to dive into this problem. As a result, I want to
> propose GSoC project on this issue, starting with something like:
> 1- Systematically create a benchmark for easily information
> gathering. Martin Liška already made the first version of it, but I
> need to improve it.
> 2- Find and document the global states (Try to reduce the gcc's
> global states as well).
> 3- Define the parallelization strategy.
> 4- First parallelization attempt.

Hi Giuliano,

Thanks very much for working on this. It could be very useful, for
example, one bottleneck we have is slow compilation of big single
source file after intensively using distribution compilation. Of
course, a good parallelization strategy is needed.

Thanks,
bin

>
> I also proposed this issue as a research project to my advisor and he
> supported me on this idea. So I can work for at least one year on
> this, and other things related to it.
>
> Would anyone be willing to mentor me on this?
>
> [1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> [2] https://www.ime.usp.br/~belinass/64cores-experiment.svg
> [3] https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg
> On Mon, Nov 19, 2018 at 8:53 AM Richard Biener wrote:
> >
> > On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi wrote:
> > >
> > > Hi! Sorry for the late reply again :P
> > >
> > > On Thu, Nov 15, 2018 at 8:29 AM Richard Biener wrote:
> > > >
> > > > On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi wrote:
> > > > >
> > > > > As a brief introduction, I am a graduate student that got interested
> > > > > in the "Parallelize the compilation using threads"(GSoC 2018 [1]).
Re: Parallelize the compilation using Threads
Hi, I have some news. :-)

I replicated Martin Liška's experiment [1] on a 64-core machine for
GCC [2] and the Linux kernel [3] (the Linux kernel was fully parallelized),
and I am excited to dive into this problem. As a result, I want to
propose a GSoC project on this issue, starting with something like:
1- Systematically create a benchmark for easy information
gathering. Martin Liška already made the first version of it, but I
need to improve it.
2- Find and document the global states (and try to reduce GCC's
global states as well).
3- Define the parallelization strategy.
4- First parallelization attempt.

I also proposed this issue as a research project to my advisor and he
supported me on this idea. So I can work for at least one year on
this, and other things related to it.

Would anyone be willing to mentor me on this?

[1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
[2] https://www.ime.usp.br/~belinass/64cores-experiment.svg
[3] https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg

On Mon, Nov 19, 2018 at 8:53 AM Richard Biener wrote:
>
> On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi
> wrote:
> >
> > Hi! Sorry for the late reply again :P
> >
> > On Thu, Nov 15, 2018 at 8:29 AM Richard Biener
> > wrote:
> > >
> > > On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi
> > > wrote:
> > > >
> > > > As a brief introduction, I am a graduate student that got interested
> > > >
> > > > in the "Parallelize the compilation using threads"(GSoC 2018 [1]). I
> > > > am a newcommer in GCC, but already have sent some patches, some of
> > > > them have already been accepted [2].
> > > >
> > > > I brought this subject up in IRC, but maybe here is a proper place to
> > > > discuss this topic.
> > > > > > > > From my point of view, parallelizing GCC itself will only speed up the > > > > compilation of projects which have a big file that creates a > > > > bottleneck in the whole project compilation (note: by big, I mean the > > > > amount of code to generate). > > > > > > That's true. During GCC bootstrap there are some of those (see PR84402). > > > > > > > > One way to improve parallelism is to use link-time optimization where > > > even single source files can be split up into multiple link-time units. > > > But > > > then there's the serial whole-program analysis part. > > > > Did you mean this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 ? > > That is a lot of data :-) > > > > It seems that 'phase opt and generate' is the most time-consuming > > part. Is that the 'GIMPLE optimization pipeline' you were talking > > about in this thread: > > https://gcc.gnu.org/ml/gcc/2018-03/msg00202.html > > It's everything that comes after the frontend parsing bits, thus this > includes in particular RTL optimization and early GIMPLE optimizations. > > > > > Additionally, I know that GCC must not > > > > change the project layout, but from the software engineering > > > > perspective, > > > > this may be a bad smell that indicates that the file should be broken > > > > into smaller files. Finally, the Makefiles will take care of the > > > > parallelization task. > > > > > > What do you mean by GCC must not change the project layout? GCC > > > happily re-orders functions and link-time optimization will reorder > > > TUs (well, linking may as well). > > > > > > > That was a response to a comment made on IRC: > > > > On Thu, Nov 15, 2018 at 9:44 AM Jonathan Wakely > > wrote: > > >I think this is in response to a comment I made on IRC. Giuliano said > > >that if a project has a very large file that dominates the total build > > >time, the file should be split up into smaller pieces. I said "GCC > > >can't restructure people's code. 
it can only try to compile it > > >faster". We weren't referring to code transformations in the compiler > > >like re-ordering functions, but physically refactoring the source > > >code. > > > > Yes. But from one of the attachments from PR84402, it seems that such > > files exist on GCC, > > https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440 > > > > > > My questions are: > > > > > > > > 1. Is there any project compilation that will significantly be improved > > > > if GCC runs in parallel? Do someone has data
Re: Parallelize the compilation using Threads
On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi wrote: > > Hi! Sorry for the late reply again :P > > On Thu, Nov 15, 2018 at 8:29 AM Richard Biener > wrote: > > > > On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi > > wrote: > > > > > > As a brief introduction, I am a graduate student that got interested > > > > > > in the "Parallelize the compilation using threads"(GSoC 2018 [1]). I > > > am a newcommer in GCC, but already have sent some patches, some of > > > them have already been accepted [2]. > > > > > > I brought this subject up in IRC, but maybe here is a proper place to > > > discuss this topic. > > > > > > From my point of view, parallelizing GCC itself will only speed up the > > > compilation of projects which have a big file that creates a > > > bottleneck in the whole project compilation (note: by big, I mean the > > > amount of code to generate). > > > > That's true. During GCC bootstrap there are some of those (see PR84402). > > > > > One way to improve parallelism is to use link-time optimization where > > even single source files can be split up into multiple link-time units. But > > then there's the serial whole-program analysis part. > > Did you mean this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 ? > That is a lot of data :-) > > It seems that 'phase opt and generate' is the most time-consuming > part. Is that the 'GIMPLE optimization pipeline' you were talking > about in this thread: > https://gcc.gnu.org/ml/gcc/2018-03/msg00202.html It's everything that comes after the frontend parsing bits, thus this includes in particular RTL optimization and early GIMPLE optimizations. > > > Additionally, I know that GCC must not > > > change the project layout, but from the software engineering perspective, > > > this may be a bad smell that indicates that the file should be broken > > > into smaller files. Finally, the Makefiles will take care of the > > > parallelization task. 
> > > > What do you mean by GCC must not change the project layout? GCC > > happily re-orders functions and link-time optimization will reorder > > TUs (well, linking may as well). > > > > That was a response to a comment made on IRC: > > On Thu, Nov 15, 2018 at 9:44 AM Jonathan Wakely wrote: > >I think this is in response to a comment I made on IRC. Giuliano said > >that if a project has a very large file that dominates the total build > >time, the file should be split up into smaller pieces. I said "GCC > >can't restructure people's code. it can only try to compile it > >faster". We weren't referring to code transformations in the compiler > >like re-ordering functions, but physically refactoring the source > >code. > > Yes. But from one of the attachments from PR84402, it seems that such > files exist on GCC, > https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440 > > > > My questions are: > > > > > > 1. Is there any project compilation that will significantly be improved > > > if GCC runs in parallel? Do someone has data about something related > > > to that? How about the Linux Kernel? If not, I can try to bring some. > > > > We do not have any data about this apart from experiments with > > splitting up source files for PR84402. > > > > > 2. Did I correctly understand the goal of the parallelization? Can > > > anyone provide extra details to me? > > > > You may want to search the mailing list archives since we had a > > student application (later revoked) for the task with some discussion. > > > > In my view (I proposed the thing) the most interesting parts are > > getting GCCs global state documented and reduced. The parallelization > > itself is an interesting experiment but whether there will be any > > substantial improvement for builds that can already benefit from make > > parallelism remains a question. 
>
> As I agree that documenting GCC's global states is good for the
> community and the development of GCC, I really don't think this a good
> motivation for parallelizing a compiler from a research standpoint.

True ;) Note that my suggestions to the other GSoC student were
purely based on where it's easiest to experiment with parallelization
and not where it would be most beneficial.

> There must be something or someone that could take advantage of the
> fine-grained parallelism. But that data from PR84402 seems to have the
> answer to it. :-)
Re: Parallelize the compilation using Threads
Hi! Sorry for the late reply again :P

On Thu, Nov 15, 2018 at 8:29 AM Richard Biener wrote:
>
> On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi
> wrote:
> >
> > As a brief introduction, I am a graduate student that got interested
> >
> > in the "Parallelize the compilation using threads"(GSoC 2018 [1]). I
> > am a newcommer in GCC, but already have sent some patches, some of
> > them have already been accepted [2].
> >
> > I brought this subject up in IRC, but maybe here is a proper place to
> > discuss this topic.
> >
> > From my point of view, parallelizing GCC itself will only speed up the
> > compilation of projects which have a big file that creates a
> > bottleneck in the whole project compilation (note: by big, I mean the
> > amount of code to generate).
>
> That's true. During GCC bootstrap there are some of those (see PR84402).
>
> One way to improve parallelism is to use link-time optimization where
> even single source files can be split up into multiple link-time units. But
> then there's the serial whole-program analysis part.

Did you mean this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 ?
That is a lot of data :-)

It seems that 'phase opt and generate' is the most time-consuming
part. Is that the 'GIMPLE optimization pipeline' you were talking
about in this thread:
https://gcc.gnu.org/ml/gcc/2018-03/msg00202.html

> > Additionally, I know that GCC must not
> > change the project layout, but from the software engineering perspective,
> > this may be a bad smell that indicates that the file should be broken
> > into smaller files. Finally, the Makefiles will take care of the
> > parallelization task.
>
> What do you mean by GCC must not change the project layout? GCC
> happily re-orders functions and link-time optimization will reorder
> TUs (well, linking may as well).
>

That was a response to a comment made on IRC:

On Thu, Nov 15, 2018 at 9:44 AM Jonathan Wakely wrote:
>I think this is in response to a comment I made on IRC. Giuliano said
>that if a project has a very large file that dominates the total build
>time, the file should be split up into smaller pieces. I said "GCC
>can't restructure people's code. it can only try to compile it
>faster". We weren't referring to code transformations in the compiler
>like re-ordering functions, but physically refactoring the source
>code.

Yes. But from one of the attachments from PR84402, it seems that such
files exist in GCC:
https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440

> > My questions are:
> >
> > 1. Is there any project compilation that will significantly be improved
> > if GCC runs in parallel? Do someone has data about something related
> > to that? How about the Linux Kernel? If not, I can try to bring some.
>
> We do not have any data about this apart from experiments with
> splitting up source files for PR84402.
>
> > 2. Did I correctly understand the goal of the parallelization? Can
> > anyone provide extra details to me?
>
> You may want to search the mailing list archives since we had a
> student application (later revoked) for the task with some discussion.
>
> In my view (I proposed the thing) the most interesting parts are
> getting GCCs global state documented and reduced. The parallelization
> itself is an interesting experiment but whether there will be any
> substantial improvement for builds that can already benefit from make
> parallelism remains a question.

While I agree that documenting GCC's global states is good for the
community and the development of GCC, I really don't think this is a good
motivation for parallelizing a compiler from a research standpoint.
There must be something or someone that could take advantage of the
fine-grained parallelism. But that data from PR84402 seems to have the
answer to it. :-)

On Thu, Nov 15, 2018 at 4:07 PM Szabolcs Nagy wrote:
>
> On 15/11/18 10:29, Richard Biener wrote:
> > In my view (I proposed the thing) the most interesting parts are
> > getting GCCs global state documented and reduced. The parallelization
> > itself is an interesting experiment but whether there will be any
> > substantial improvement for builds that can already benefit from make
> > parallelism remains a question.
>
> in the common case (project with many small files, much more than
> core count) i'd expect a regression:
>
> if gcc itself tries to parallelize that introduces inter thread
> synchronization and potential false sharing in gcc (e.g. malloc
> locks) that does not exist with make parallelism (glibc can avoid
> some atomic instructions when a process is si
Re: Parallelize the compilation using Threads
Hi Giuliano,

On Thu, Nov 15 2018, Richard Biener wrote:
> You may want to search the mailing list archives since we had a
> student application (later revoked) for the task with some discussion.

Specifically, the whole thread beginning with
https://gcc.gnu.org/ml/gcc/2018-03/msg00179.html

Martin
Re: Parallelize the compilation using Threads
On 15/11/18 10:29, Richard Biener wrote:
> In my view (I proposed the thing) the most interesting parts are
> getting GCCs global state documented and reduced. The parallelization
> itself is an interesting experiment but whether there will be any
> substantial improvement for builds that can already benefit from make
> parallelism remains a question.

in the common case (project with many small files, much more than
core count) i'd expect a regression:

if gcc itself tries to parallelize that introduces inter thread
synchronization and potential false sharing in gcc (e.g. malloc
locks) that does not exist with make parallelism (glibc can avoid
some atomic instructions when a process is single threaded).
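
An illustrative aside on the false-sharing concern raised above: the following is only a minimal C++11 sketch, not code from GCC or glibc. It assumes a 64-byte cache line, and the names `PackedCounters`, `PaddedCounters` and `hammer` are hypothetical. Two threads each update their own counter; in the packed layout both counters will usually sit on the same cache line, so the cores may keep invalidating each other's copy even though no data is logically shared.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <thread>

// Two independent counters that will usually land on the same cache line.
struct PackedCounters {
    std::atomic<std::uint64_t> a{0};
    std::atomic<std::uint64_t> b{0};
};

// The same counters, each aligned to its own cache line.
// Assumption: 64-byte cache lines; other targets may differ.
struct PaddedCounters {
    alignas(64) std::atomic<std::uint64_t> a{0};
    alignas(64) std::atomic<std::uint64_t> b{0};
};

// Runs two threads, each incrementing only its own counter, and
// returns the sum of both counters once the threads have joined.
template <typename Counters>
std::uint64_t hammer(Counters &c, std::uint64_t iterations) {
    auto bump = [iterations](std::atomic<std::uint64_t> &x) {
        for (std::uint64_t i = 0; i < iterations; ++i)
            x.fetch_add(1, std::memory_order_relaxed);
    };
    std::thread t1(bump, std::ref(c.a));
    std::thread t2(bump, std::ref(c.b));
    t1.join();
    t2.join();
    return c.a.load() + c.b.load();
}
```

Both layouts compute the same result; only the memory layout differs. Timing `hammer` on each layout (for example with `std::chrono::steady_clock`) is one way to see whether the effect matters on a given machine, but any difference depends on the hardware and is not guaranteed.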
Re: Parallelize the compilation using Threads
On 11/15/18 3:29 AM, Richard Biener wrote:
>
>> 2. Did I correctly understand the goal of the parallelization? Can
>> anyone provide extra details to me?
>
> You may want to search the mailing list archives since we had a
> student application (later revoked) for the task with some discussion.
>
> In my view (I proposed the thing) the most interesting parts are
> getting GCCs global state documented and reduced. The parallelization
> itself is an interesting experiment but whether there will be any
> substantial improvement for builds that can already benefit from make
> parallelism remains a question.

Agreed. Driving down the amount of global state is good in and of
itself. It's also a prerequisite for parallelizing GCC itself using
threads.

I suspect driving down global state probably isn't that interesting for
a master's thesis though :-)

jeff
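
To make the "driving down global state" point concrete, here is a small C++ sketch of the general refactoring pattern. It is an assumption-laden illustration, not GCC code: `g_error_count`, `CompileContext`, `report_error_global` and `report_error` are all hypothetical names. The idea is that state reachable only through an explicitly passed context can be given to each thread separately, whereas a file-scope variable is implicitly shared by every caller.

```cpp
#include <string>
#include <vector>

// Before: a file-scope counter that every caller implicitly shares.
// (Hypothetical name; stands in for any global in a compiler.)
static int g_error_count = 0;

void report_error_global(const std::string &) {
    ++g_error_count;  // a data race if two threads call this at once
}

// After: the same state lives in an explicit context object.
struct CompileContext {
    int error_count = 0;
    std::vector<std::string> messages;
};

// Each caller passes the context it owns, so two threads working on
// two different contexts never touch the same memory here.
void report_error(CompileContext &ctx, const std::string &msg) {
    ++ctx.error_count;
    ctx.messages.push_back(msg);
}
```

With this shape, a worker thread could own one `CompileContext` per unit of work and merge the results afterwards; whether that split is practical for any particular piece of GCC's state is exactly what would need to be documented first.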
Re: Parallelize the compilation using Threads
On Thu, 15 Nov 2018 at 10:29, Richard Biener wrote:
>
> On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi
> wrote:
> > Additionally, I know that GCC must not
> > change the project layout, but from the software engineering perspective,
> > this may be a bad smell that indicates that the file should be broken
> > into smaller files. Finally, the Makefiles will take care of the
> > parallelization task.
>
> What do you mean by GCC must not change the project layout?

I think this is in response to a comment I made on IRC. Giuliano said
that if a project has a very large file that dominates the total build
time, the file should be split up into smaller pieces. I said "GCC
can't restructure people's code. It can only try to compile it
faster". We weren't referring to code transformations in the compiler
like re-ordering functions, but to physically refactoring the source
code.
Re: Parallelize the compilation using Threads
On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi wrote: > > As a brief introduction, I am a graduate student that got interested > > in the "Parallelize the compilation using threads"(GSoC 2018 [1]). I > am a newcommer in GCC, but already have sent some patches, some of > them have already been accepted [2]. > > I brought this subject up in IRC, but maybe here is a proper place to > discuss this topic. > > From my point of view, parallelizing GCC itself will only speed up the > compilation of projects which have a big file that creates a > bottleneck in the whole project compilation (note: by big, I mean the > amount of code to generate). That's true. During GCC bootstrap there are some of those (see PR84402). One way to improve parallelism is to use link-time optimization where even single source files can be split up into multiple link-time units. But then there's the serial whole-program analysis part. > Additionally, I know that GCC must not > change the project layout, but from the software engineering perspective, > this may be a bad smell that indicates that the file should be broken > into smaller files. Finally, the Makefiles will take care of the > parallelization task. What do you mean by GCC must not change the project layout? GCC happily re-orders functions and link-time optimization will reorder TUs (well, linking may as well). > My questions are: > > 1. Is there any project compilation that will significantly be improved > if GCC runs in parallel? Do someone has data about something related > to that? How about the Linux Kernel? If not, I can try to bring some. We do not have any data about this apart from experiments with splitting up source files for PR84402. > 2. Did I correctly understand the goal of the parallelization? Can > anyone provide extra details to me? You may want to search the mailing list archives since we had a student application (later revoked) for the task with some discussion. 
In my view (I proposed the thing) the most interesting parts are getting GCCs global state documented and reduced. The parallelization itself is an interesting experiment but whether there will be any substantial improvement for builds that can already benefit from make parallelism remains a question. > I am willing to turn my master’s thesis on that and also apply to GSoC > 2019 if it shows to be fruitful. > > [1] https://gcc.gnu.org/wiki/SummerOfCode > [2] https://patchwork.ozlabs.org/project/gcc/list/?submitter=74682 > > > Thanks
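
As a rough way to quantify the remark above about the serial whole-program analysis part, Amdahl's law bounds the speedup of any job that keeps a serial fraction. The sketch below is illustrative only: `amdahl_speedup` is a hypothetical helper, and the serial fraction is an input the reader must measure, not a number taken from this thread.

```cpp
// Amdahl's law: speedup = 1 / (s + (1 - s) / n), where s is the
// fraction of the work that stays serial and n is the worker count.
// Hypothetical helper for illustration; expects 0 <= s <= 1, n >= 1.
double amdahl_speedup(double serial_fraction, unsigned workers) {
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers);
}
```

For instance, if one assumes that 20% of a compile stays serial, `amdahl_speedup(0.2, 64)` comes out below 5 even with 64 workers, which is why the size of the serial part matters more than the core count.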
Parallelize the compilation using Threads
As a brief introduction, I am a graduate student who got interested
in the "Parallelize the compilation using threads" project (GSoC 2018 [1]). I
am a newcomer to GCC, but I have already sent some patches, some of
which have been accepted [2].

I brought this subject up on IRC, but maybe this is a more proper place
to discuss it.

From my point of view, parallelizing GCC itself will only speed up the
compilation of projects that have a big file that creates a
bottleneck in the whole project compilation (note: by big, I mean the
amount of code to generate). Additionally, I know that GCC must not
change the project layout, but from the software engineering perspective,
this may be a bad smell that indicates that the file should be broken
into smaller files. Then the Makefiles will take care of the
parallelization task.

My questions are:

1. Is there any project whose compilation would be significantly improved
if GCC ran in parallel? Does anyone have data related
to that? How about the Linux kernel? If not, I can try to bring some.

2. Did I correctly understand the goal of the parallelization? Can
anyone provide extra details?

I am willing to base my master's thesis on this and also apply to GSoC
2019 if it proves to be fruitful.

[1] https://gcc.gnu.org/wiki/SummerOfCode
[2] https://patchwork.ozlabs.org/project/gcc/list/?submitter=74682

Thanks