Re: [Scons-dev] SCons performance investigations

2017-07-23 Thread Bill Deegan
Jason,

I added your comments, mostly annotated with (JK), and your general thoughts
at the bottom, attributed to you.
It may make sense to reorganize, add discussion subpages, and better sum up
the collective thoughts.

-Bill

On Sat, Jul 22, 2017 at 2:23 PM, Bill Deegan 
wrote:

> Jason,
>
> Any chance you could add these comments to the wiki page?
> https://bitbucket.org/scons/scons/wiki/NeedForSpeed
>
> -Bill
>
> On Sat, Jul 22, 2017 at 10:09 AM, Jason Kenny  wrote:
>
>> Some additional thoughts
>>
>>
>>
>> Serial DAG traversal:
>>
>>    - The issue here as well is that the DAG used for builds is based on
>>      nodes. There is a bit of logic to deal with handling side effects and
>>      build actions that have multiple outputs. Greg Noel had made a push
>>      for something called the TNG taskmaster. I understand now that the
>>      main fix he was going for is to tweak SCons to navigate a builder DAG
>>      instead of a node DAG. The node DAG is great for getting the main
>>      organization, but after that it is generally trivial to make a DAG
>>      based on builders at the same time. Traversing this is much faster,
>>      requires less “special” logic, and will be easier to parallelize.
>>       - One big improvement this provides is that we only need to test
>>         whether the sources or targets are out of date if the dependent
>>         builders are all up to date. If one of them is out of date, we
>>         just build. This is versus checking each node to see if the build
>>         action has been done, which requires extra scans and work in the
>>         current logic.
>>       - Given a builder is out of date, you just mark all parents out of
>>         date. We only care about builders in a set that we don’t know are
>>         out of date yet. Simple tweaks to how we go through the tree can
>>         mean we only need to touch a few nodes.
>>
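
As a rough illustration of the "mark all parents out of date" step described
above, here is a minimal sketch. The builder names, the `parents` mapping and
the function `propagate_out_of_date` are all made up for the example; none of
this is SCons, Parts, or TNG taskmaster code.

```python
from collections import deque

def propagate_out_of_date(parents, changed):
    """Return every builder that must rerun, given builders known to be stale.

    `parents` maps a builder name to the builders that consume its outputs
    (a hypothetical representation of a builder DAG).
    """
    stale = set(changed)
    queue = deque(changed)
    while queue:
        builder = queue.popleft()
        for parent in parents.get(builder, ()):
            if parent not in stale:   # only visit builders not yet known stale
                stale.add(parent)
                queue.append(parent)
    return stale

# Hypothetical graph: compile steps feed link steps, which feed packaging.
parents = {
    "compile_a": ["link_app"],
    "compile_b": ["link_app"],
    "compile_c": ["link_tool"],
    "link_app": ["package"],
    "link_tool": ["package"],
}
stale = propagate_out_of_date(parents, ["compile_a"])
print(sorted(stale))
```

Builders that are never reached (here `compile_b`, `compile_c` and
`link_tool`) are the only ones that would still need a source/target check.
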
>> Start up time:
>>
>>    - Zero build time is going to be the worst case for an up-to-date
>>      build, as we have to make sure all items are in a good state. Time to
>>      start building on a diff should be a lot faster. SCons spends a lot
>>      of time having to read everything on second passes. We can use our
>>      cache much better to store state on what builds what, etc., to avoid
>>      even having to read a file. If the file did not change, we already
>>      know the node/builder tree it will provide. We already know the
>>      actions. We can start building items as soon as an MD5/timestamp
>>      check fails most of the time. Globs can store information about what
>>      they read and processed, and only need to go off when we notice a
>>      directory timestamp change. Avoiding processing build files and
>>      loading known state instead is much faster than processing the Python
>>      code. My work in Parts has shown this. The trick is knowing when you
>>      might have to load a file again to make sure custom logic gets
>>      processed correctly.
>>    - In the case of Parts it would be great to load files concurrently and
>>      in parallel. I think I have a way to do this concurrently, which I
>>      have not done yet. The main issue is that the node FS object tree is
>>      a sync point for being parallel.
>>
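
The glob idea above could look something like the following sketch. It
assumes a cache keyed on the directory's modification time; the `CachedGlob`
class and its `scans` counter are hypothetical and exist only to make the
behaviour visible. It is not how SCons or Parts implement `Glob`.

```python
import glob
import os
import tempfile

class CachedGlob:
    """Cache glob results per directory; rescan only if the directory mtime changed."""

    def __init__(self):
        self._cache = {}   # (directory, pattern) -> (mtime_ns, sorted matches)
        self.scans = 0     # number of real filesystem scans, for illustration

    def glob(self, directory, pattern):
        mtime = os.stat(directory).st_mtime_ns
        key = (directory, pattern)
        cached = self._cache.get(key)
        if cached is not None and cached[0] == mtime:
            return cached[1]            # directory unchanged: reuse stored result
        self.scans += 1
        matches = sorted(glob.glob(os.path.join(directory, pattern)))
        self._cache[key] = (mtime, matches)
        return matches

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as src_dir:
    open(os.path.join(src_dir, "a.c"), "w").close()
    cache = CachedGlob()
    first = cache.glob(src_dir, "*.c")
    second = cache.glob(src_dir, "*.c")   # served from the cache
    scans_after_repeat = cache.scans
```

Two caveats with this approach: a directory's timestamp only changes when
entries are added, removed or renamed directly inside it, so recursive
patterns would need one check per directory, and coarse timestamp granularity
on some filesystems can hide a very recent change.
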
>> CacheDir:
>>
>> 100% agree.
>>
>> SConsign generation:
>>
>>    - I think this is a bigger deal for larger builds. I have found in
>>      Parts that, as I store more data, I would try to break up the items
>>      into different files. This helps, but in the end, at some point a
>>      pickle or JSON dump takes time. It also takes time to load them: in
>>      builds I have had, loading 700 MB files takes even the best systems a
>>      moment. This is a big waste when I only need to get a little bit of
>>      data. Likewise, the storing of the data could and should be happening
>>      as we build items. As noted, we don’t have a good way to store a
>>      single item without storing the whole file. If the file is large,
>>      100 MB to GBs, this can take time, as in many seconds, which in the
>>      end annoys users. I would say, with what I do have working well in
>>      Parts, that data storage and retrieval is the big time suck.
>>      Addressing this would have the largest impact for me.
>>
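
One possible way to store and fetch a single item without rewriting the whole
file is a keyed store, sketched here with `sqlite3` and `pickle` from the
Python standard library. This is an assumption made for illustration, not the
format SCons or Parts use; the `ItemStore` class, the table layout and the
sample entries are invented.

```python
import pickle
import sqlite3

class ItemStore:
    """Keyed store: each item is written and read on its own."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, value BLOB)"
        )

    def put(self, key, value):
        with self._db:   # commits the single-row write on success
            self._db.execute(
                "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
                (key, pickle.dumps(value)),
            )

    def get(self, key, default=None):
        row = self._db.execute(
            "SELECT value FROM items WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else pickle.loads(row[0])

# Hypothetical signature records, written one at a time as items get built.
store = ItemStore()
store.put("src/main.c", {"csig": "abc123", "timestamp": 1500800000})
store.put("src/util.c", {"csig": "def456", "timestamp": 1500800100})
entry = store.get("src/main.c")   # reads one row, not the whole database
```

The point of the design is that the cost of a read or write scales with the
item, not with the total amount of stored state.
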
>> Process spawning:
>>
>>    - I add this as we had submitted a subprocess fix for POSIX systems.
>>      The code affects larger builds more than smaller builds because of
>>      forking behavior. I don’t believe it has been added to SCons as of
>>      yet.
>>    - As a side design note, if we did make a multiprocessing setup for
>>      SCons, this might be less of an issue, as the “process” workers only
>>      need information about a build to run on. Changes to node state would
>>      have to be synced with the main process via messages, as there would
>>      be no fast, efficient way to share the whole tree across all the
>>      processes.
>>    - Another thought is we might want to look at some nested parallel
>>      strategies to make a task-like setup that might allow us to use the
>>      TBB Python library to avoid the

Re: [Scons-dev] SCons performance investigations

2017-07-23 Thread Bill Deegan
Jonathon,

I've seen the clone-before-passing pattern in other builds.
I'm wondering: if you could put your environment in a read-only mode before
passing it (not allowing changes to be made), would that suffice and remove
the desire/need to clone()?
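
To make the read-only idea concrete, here is a minimal sketch using
`types.MappingProxyType` from the Python standard library. A plain dict stands
in for the construction environment, which is an assumption: SCons
environments are not plain dicts, and no such read-only mode is implied to
exist in SCons. The `child_sconscript` function is hypothetical.

```python
from types import MappingProxyType

# A plain dict stands in for a construction environment (an assumption).
env = {"CC": "gcc", "CCFLAGS": ("-O2",)}

def child_sconscript(env_view):
    """Hypothetical child script that tries to modify what it was given."""
    try:
        env_view["CC"] = "clang"
    except TypeError:
        # The read-only view refuses item assignment.
        return "rejected"
    return "modified"

result = child_sconscript(MappingProxyType(env))
```

The view only blocks assignment through the mapping itself. A mutable value
such as a list could still be changed in place by the child, which is why the
flags above are stored as a tuple.
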

-Bill

On Sun, Jul 23, 2017 at 4:51 PM, Jonathon Reinhart <
jonathon.reinh...@gmail.com> wrote:

> I just wanted to add some quick anecdotes.  In some of our largest, most
> complicated builds, we have observed a lot of the same things as you all
> have.
>
> One time we did some quick profiling, and saw that much CPU time during a
> null build was spent in the variable substitution.
>
> Additionally, we also have a habit of cloning the environment before
> passing it to a SConscript. This is for safety - to ensure that a child
> SConscript can't mess up the environment for its siblings.
>
>
> Jonathon Reinhart

Re: [Scons-dev] SCons performance investigations

2017-07-23 Thread Jonathon Reinhart
I just wanted to add some quick anecdotes.  In some of our largest, most
complicated builds, we have observed a lot of the same things as you all
have.

One time we did some quick profiling, and saw that much CPU time during a
null build was spent in the variable substitution.

Additionally, we also have a habit of cloning the environment before
passing it to a SConscript. This is for safety - to ensure that a child
SConscript can't mess up the environment for its siblings.


Jonathon Reinhart
