Hi Neil,

This was discussed in the CXX Mesos Slack channel yesterday.

Basically, the two are separate and independent. Regardless of stout work, I 
anticipate that PCH work will dramatically speed up the Windows build (and 
Linux too, although I have less experience in that area). I'm going to run some 
benchmarks on a subset of the code to give a good "before/after" idea of the 
speedup and report to the list.

If the work to make stout a non-header-only library is done, it will do a fair 
amount to speed up incremental builds (i.e. you update just the implementation 
of a stout method, and only the related source file is rebuilt). However, the 
non-header-only work won't do anything in a "clean build" scenario. And, of 
course, if you change the interface of a stout method, all bets are off and you 
get to rebuild virtually the whole world.

PCH, on the other hand, will speed up all compiles across the board (whether 
or not they use stout). That said, if a stout change is made (assuming stout is 
still header-only), you will still rebuild everything, but the builds will go 
much faster. That *may* be fast enough to take the sting out of significant 
stout changes, but making stout a non-header-only library will still help the 
incremental build cases regardless.
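
To put rough numbers on that trade-off, here is a minimal back-of-the-envelope 
sketch. The per-file timings it is meant to be fed (about 4 seconds without 
PCH, about 200 ms with PCH, about 4 seconds to generate the PCH) are the ones 
from prior experience quoted in the proposal below, not measurements of Mesos. 
The struct and function names are hypothetical, and the model assumes a serial 
build and ignores link time and ccache.

```cpp
#include <cassert>

// Illustrative timings only (milliseconds). These are assumptions borrowed
// from the proposal's anecdote, not benchmarks of the Mesos build.
struct BuildTimings {
  long long plain_ms_per_file;  // compile one file without PCH
  long long pch_ms_per_file;    // compile one file with PCH
  long long pch_create_ms;      // generate the precompiled header once
};

// Estimated serial compile time for `files` source files without PCH.
long long compile_ms_without_pch(const BuildTimings& t, long long files) {
  assert(files >= 0);
  return files * t.plain_ms_per_file;
}

// Estimated serial compile time with PCH. `pch_stale` is true when a header
// inside the PCH changed (for example, a header-only stout change), so the
// PCH must be regenerated once before any file is compiled.
long long compile_ms_with_pch(const BuildTimings& t, long long files,
                              bool pch_stale) {
  assert(files >= 0);
  return (pch_stale ? t.pch_create_ms : 0) + files * t.pch_ms_per_file;
}
```

With timings of {4000, 200, 4000} and a made-up count of 500 source files, a 
header-only stout change costs compile_ms_without_pch(t, 500) = 2,000,000 ms 
versus compile_ms_with_pch(t, 500, true) = 104,000 ms. If stout were a library 
and only one implementation file changed, the same functions give 4,000 ms 
versus 200 ms, which is why the two efforts complement each other.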

Hope that clarifies,

/Jeff

-----Original Message-----
From: Neil Conway [mailto:neil.con...@gmail.com] 
Sent: Tuesday, February 14, 2017 11:45 AM
To: dev <dev@mesos.apache.org>
Subject: Re: Proposal for Mesos Build Improvements

I'm curious to hear more about how using PCH compares with making stout a 
non-header-only library. Is PCH easier to implement, or is it expected to offer 
a more dramatic improvement in compile times? Would making both changes 
eventually make sense?

Neil

On Tue, Feb 14, 2017 at 11:28 AM, Jeff Coffler 
<jeff.coff...@microsoft.com.invalid> wrote:
> Proposal For Build Improvements
>
> The Mesos build process is in dire need of some build infrastructure 
> improvements. These improvements will improve speed and ease of work in 
> particular components, and dramatically improve overall build time, 
> especially in the Windows environment, but likely in the Linux environment as 
> well.
>
>
> Background:
>
> It is currently recommended to use the ccache project with the Mesos build 
> process. This makes the Linux build process more tolerable in terms of speed, 
> but unfortunately such software is not available on Windows. Ultimately, 
> though, the caching software is covering up two fundamental flaws in the 
> overall build process:
>
> 1. Lack of use of libraries
> 2. Lack of precompiled headers
>
> By not allowing use of libraries, the overall build process is often much 
> longer, particularly when a lot of work is being done in a particular 
> component. With libraries, if work is being done in a particular component, 
> only that library need be rebuilt (and then the overall image relinked). 
> Currently, since there is no such modularization, all source files must be 
> considered at build time. Interestingly enough, there is such modularization 
> in the source code layout; that modularization just isn't utilized at the 
> compiler level.
>
> Precompiled headers exist on both Windows and Linux. For Linux, you can refer 
> to https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
> Straight from the GNU CC documentation: "The time the compiler takes to 
> process these header files over and over again can account for nearly all of 
> the time required to build the project."
>
> In my prior use of precompiled headers, each C or C++ file generally took 
> about 4 seconds to compile. After switching to precompiled headers, the 
> precompiled header creation took about 4 seconds, but each C/C++ file now 
> took about 200 milliseconds to compile. The overall build time was thus 
> dramatically reduced.
>
>
> Scope of Changes:
>
> These changes are only being proposed for the CMake system. Going forward, 
> the CMake system is the easiest way to maintain some level of portability 
> between the Linux and Windows platforms.
>
>
> Details for Modularization:
>
> For the modularization, the intent is simply to compile each source 
> directory of files, if functionally separate, into an archive (.a) file. 
> These archive files will then be linked together to form the actual 
> executables. These changes will primarily be in the CMake system, and should 
> have limited effect on any actual source code.
>
> At a later date, if it makes sense, we can look at building shared library 
> (.so) files. However, this only makes sense if the code is truly 
> shared between different executable files. If that's not the case, then it 
> likely makes sense just to stick with .a files. Regardless, generation of .so 
> files is out of scope for this change.
>
>
> Details for Precompiled Header Changes:
>
> Precompiled headers will make use of stout (a very large header-only library) 
> essentially "free" from a compile-time overhead point of view. Basically, 
> precompiled headers will take a list of header files (including very long 
> header files, like "windows.h"), and generate the compiler memory structures 
> for their representation.
>
> During precompiled header generation, these memory structures are flushed to 
> disk. Then, when components are built, the memory structures are reloaded 
> from disk, which is dramatically faster than actually parsing the tens of 
> thousands of lines of header files and building the memory structures.
>
> For precompiled headers to be useful, a relatively "consistent" set of 
> headers must be included by all of the C/C++ files. So, for example, consider 
> the following C file:
>
> #if defined(windows)
> #include <windows.h>
> #endif
>
> #include <header-a>
> #include <header-b>
> #include <header-c>
>
> < - Remainder of module - >
>
> To make a precompiled header for this module, all of the #include files would 
> be included in a new file, mesos_common.h. The C file would then be changed 
> as follows:
>
> #include "mesos_common.h"
>
> < - Remainder of module - >
>
> Structurally, the code is identical, and need not be built with precompiled 
> headers. However, use of precompiled headers will make file compilation 
> dramatically faster.
>
> Note that other include files can be included after the precompiled header if 
> appropriate. For example, the following is valid:
>
> #include "mesos_common.h"
> #include <header-d>
>
> < - Remainder of module - >
>
> For efficiency purposes, if a header file is included by 50% or more of the 
> source files, it should be included in the precompiled header. If a header is 
> included in fewer than 50% of the source files, then it can be separately 
> included (and thus would not benefit from precompiled headers). Note that 
> this is a guideline; even if a header is used by less than 50% of source 
> files, if it's very large, we still may decide to throw it in the precompiled 
> header.
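
As a rough illustration of the 50% guideline above, the selection could be 
sketched as follows. This is only a sketch under stated assumptions: the 
function name and the input format (one list of included header names per 
source file) are hypothetical and not part of Mesos or its CMake system, and 
it deliberately ignores header size, so a very large but rarely used header 
would still have to be added by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns the headers included by at least half of the
// given source files, i.e. the candidates for the precompiled header.
std::set<std::string> pch_candidates(
    const std::vector<std::vector<std::string>>& includes_per_file) {
  const std::size_t total = includes_per_file.size();

  // Count, for each header, how many source files include it.
  std::map<std::string, std::size_t> counts;
  for (const auto& includes : includes_per_file) {
    // De-duplicate first so a repeated #include in one file counts once.
    const std::set<std::string> unique(includes.begin(), includes.end());
    for (const auto& header : unique) {
      ++counts[header];
    }
  }

  std::set<std::string> result;
  for (const auto& entry : counts) {
    assert(entry.second <= total);  // each file contributes at most once
    // "50% or more", written with integers to avoid floating point.
    if (2 * entry.second >= total) {
      result.insert(entry.first);
    }
  }
  return result;
}
```

For four source files where header-a appears in all four, header-b in two, 
and header-c and header-d in one each, this returns header-a and header-b; 
the other two would stay as ordinary #include lines after the precompiled 
header.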
>
> Note that, for use of precompiled headers, there will be a great deal 
> of code churn (almost exclusively in the #include list of source 
> files). This will mean that there will be a lot of code merges, but 
> ultimately no "code logic" changes. If merges are not done in a timely 
> fashion, this can easily result in needless hand merging of changes. 
> Due to these issues, we will need a dedicated shepherd who will 
> integrate the patches quickly. This kind of work is easily invalidated 
> when the include list is changed by another developer, forcing 
> us to redo the patch. [Note that Joseph has stepped up to the plate 
> for this, thanks Joseph!]
>
>
> This is the end of my proposal; feedback would be appreciated.
