Re: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build?
Andrew: This is great information. I will take look LoadToolDefFile first, then check parsing logic in C #include file. Thanks Liming -Original Message- From: af...@apple.com [mailto:af...@apple.com] Sent: Thursday, February 18, 2016 12:03 AM To: Gao, Liming Cc: edk2-devel@lists.01.org Subject: Re: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build? > On Feb 17, 2016, at 6:34 AM, Gao, Liming wrote: > > Andrew: > Thanks for this report. Tottime is second or microsecond? If LoadToolDefFile > is called only once, how much performance can be saved? > Liming, Python cProfile produces data in the pstats format and time is in seconds. So it looks like the parse of the file takes 0.19 seconds, but doing it 47 times is almost 9 seconds. If you look at https://github.com/tianocore/edk2/blob/master/BaseTools/Conf/tools_def.template you can see the file is 463 KB, so you end up parsing 21 GB of data every build. Note it is going to depend on the layout of the FVs for the platform as you need GUIDed encapsulation sections to see the performance hit. Thus me pruning the Conf/tools_def.txt down to a 10K file saved ~8 seconds of build time. My profile run was on a dirty build so none of the modules rebuilt. If you profile a clean build the stats change as a lot of time is spent parsing the C code for the #include infrastructure, and this shows up as spending time in regex functions. I seem to remember there was a regex function that ran for about 4+ seconds for example. You can also do this: Update RunToolFromSource: python -m cProfile -o edk2.profile "`dirname $0`/../../Source/Python"/`basename $0`/`basename $0`.py $* >From the command line: python -m pstats edk2.profile Welcome to the profile statistics browser. % help Documented commands (type help ): EOF add callees callers quit read reverse sort stats strip Undocumented commands: == help % This gives you an interactive prompt that lets you browse the stats, and track callees and callers. There are also optional packages that let you build call graphs, and do other interesting things on the pstats object. I've not tried those out. Thanks, Andrew Fish > Thanks > Liming > -Original Message- > From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Andrew > Fish > Sent: Tuesday, February 16, 2016 5:42 PM > To: edk2-devel@lists.01.org > Subject: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build? > > > I turned on profiling in Python: > BaseTools/BinWrappers/PosixLike/RunToolFromSource: > #!/usr/bin/env bash > #python `dirname $0`/RunToolFromSource.py `basename $0` $* > PYTHONPATH="`dirname $0`/../../Source/Python" \ >python -m cProfile -s time "`dirname $0`/../../Source/Python"/`basename > $0`/`basename $0`.py $* > > > The slowest item in the working thread was ToolDefClassObject. > LoadToolDefFile() > > Ordered by: internal time > > ncalls tottime percall cumtime percall filename:lineno(function) > 478.9320.190 10.9860.234 > ToolDefClassObject.py:60(LoadToolDefFile) > 40674781.3370.0001.3370.000 {method 'split' of 'str' objects} >285191.0640.0001.0640.000 {method 'execute' of > 'sqlite3.Cursor' objects} > 7333780.8260.0001.1610.000 StringIO.py:208(write) > 710.6130.0090.6130.009 {method 'items' of 'dict' > objects} > 230.4990.022 11.9050.518 > GuidSection.py:245(__FindExtendTool__) > > > I see it gets called at the beginning of the build from InitBuild() and that > makes sense. If I do a genmake build LoadToolDefFile() only gets called a > single time. > > But it looks like __FindExtendTool__ is also making calls: > ToolDefinition = > ToolDefClassObject.ToolDefDict(GenFdsGlobalVariable.ConfDir).ToolsDefTxtDictionary > > So it looks to me like class GuidSection() is parsing the tools_def.txt for > every GUID'ed section that has a GUID defined tool? How come this information > is not cached? > > Does anyone see a quick fix? > > Thanks, > > Andrew Fish > ___ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel ___ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build?
> On Feb 17, 2016, at 6:34 AM, Gao, Liming wrote: > > Andrew: > Thanks for this report. Tottime is second or microsecond? If LoadToolDefFile > is called only once, how much performance can be saved? > Liming, Python cProfile produces data in the pstats format and time is in seconds. So it looks like the parse of the file takes 0.19 seconds, but doing it 47 times is almost 9 seconds. If you look at https://github.com/tianocore/edk2/blob/master/BaseTools/Conf/tools_def.template you can see the file is 463 KB, so you end up parsing 21 GB of data every build. Note it is going to depend on the layout of the FVs for the platform as you need GUIDed encapsulation sections to see the performance hit. Thus me pruning the Conf/tools_def.txt down to a 10K file saved ~8 seconds of build time. My profile run was on a dirty build so none of the modules rebuilt. If you profile a clean build the stats change as a lot of time is spent parsing the C code for the #include infrastructure, and this shows up as spending time in regex functions. I seem to remember there was a regex function that ran for about 4+ seconds for example. You can also do this: Update RunToolFromSource: python -m cProfile -o edk2.profile "`dirname $0`/../../Source/Python"/`basename $0`/`basename $0`.py $* >From the command line: python -m pstats edk2.profile Welcome to the profile statistics browser. % help Documented commands (type help ): EOF add callees callers quit read reverse sort stats strip Undocumented commands: == help % This gives you an interactive prompt that lets you browse the stats, and track callees and callers. There are also optional packages that let you build call graphs, and do other interesting things on the pstats object. I've not tried those out. Thanks, Andrew Fish > Thanks > Liming > -Original Message- > From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Andrew > Fish > Sent: Tuesday, February 16, 2016 5:42 PM > To: edk2-devel@lists.01.org > Subject: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build? > > > I turned on profiling in Python: > BaseTools/BinWrappers/PosixLike/RunToolFromSource: > #!/usr/bin/env bash > #python `dirname $0`/RunToolFromSource.py `basename $0` $* > PYTHONPATH="`dirname $0`/../../Source/Python" \ >python -m cProfile -s time "`dirname $0`/../../Source/Python"/`basename > $0`/`basename $0`.py $* > > > The slowest item in the working thread was ToolDefClassObject. > LoadToolDefFile() > > Ordered by: internal time > > ncalls tottime percall cumtime percall filename:lineno(function) > 478.9320.190 10.9860.234 > ToolDefClassObject.py:60(LoadToolDefFile) > 40674781.3370.0001.3370.000 {method 'split' of 'str' objects} >285191.0640.0001.0640.000 {method 'execute' of > 'sqlite3.Cursor' objects} > 7333780.8260.0001.1610.000 StringIO.py:208(write) > 710.6130.0090.6130.009 {method 'items' of 'dict' > objects} > 230.4990.022 11.9050.518 > GuidSection.py:245(__FindExtendTool__) > > > I see it gets called at the beginning of the build from InitBuild() and that > makes sense. If I do a genmake build LoadToolDefFile() only gets called a > single time. > > But it looks like __FindExtendTool__ is also making calls: > ToolDefinition = > ToolDefClassObject.ToolDefDict(GenFdsGlobalVariable.ConfDir).ToolsDefTxtDictionary > > So it looks to me like class GuidSection() is parsing the tools_def.txt for > every GUID'ed section that has a GUID defined tool? How come this information > is not cached? > > Does anyone see a quick fix? > > Thanks, > > Andrew Fish > ___ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel ___ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build?
Andrew: Thanks for this report. Tottime is second or microsecond? If LoadToolDefFile is called only once, how much performance can be saved? Thanks Liming -Original Message- From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Andrew Fish Sent: Tuesday, February 16, 2016 5:42 PM To: edk2-devel@lists.01.org Subject: [edk2] Why is the tools_def.txt file parsed 47 times on my fds build? I turned on profiling in Python: BaseTools/BinWrappers/PosixLike/RunToolFromSource: #!/usr/bin/env bash #python `dirname $0`/RunToolFromSource.py `basename $0` $* PYTHONPATH="`dirname $0`/../../Source/Python" \ python -m cProfile -s time "`dirname $0`/../../Source/Python"/`basename $0`/`basename $0`.py $* The slowest item in the working thread was ToolDefClassObject. LoadToolDefFile() Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 478.9320.190 10.9860.234 ToolDefClassObject.py:60(LoadToolDefFile) 40674781.3370.0001.3370.000 {method 'split' of 'str' objects} 285191.0640.0001.0640.000 {method 'execute' of 'sqlite3.Cursor' objects} 7333780.8260.0001.1610.000 StringIO.py:208(write) 710.6130.0090.6130.009 {method 'items' of 'dict' objects} 230.4990.022 11.9050.518 GuidSection.py:245(__FindExtendTool__) I see it gets called at the beginning of the build from InitBuild() and that makes sense. If I do a genmake build LoadToolDefFile() only gets called a single time. But it looks like __FindExtendTool__ is also making calls: ToolDefinition = ToolDefClassObject.ToolDefDict(GenFdsGlobalVariable.ConfDir).ToolsDefTxtDictionary So it looks to me like class GuidSection() is parsing the tools_def.txt for every GUID'ed section that has a GUID defined tool? How come this information is not cached? Does anyone see a quick fix? Thanks, Andrew Fish ___ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel ___ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
[edk2] Why is the tools_def.txt file parsed 47 times on my fds build?
I turned on profiling in Python: BaseTools/BinWrappers/PosixLike/RunToolFromSource: #!/usr/bin/env bash #python `dirname $0`/RunToolFromSource.py `basename $0` $* PYTHONPATH="`dirname $0`/../../Source/Python" \ python -m cProfile -s time "`dirname $0`/../../Source/Python"/`basename $0`/`basename $0`.py $* The slowest item in the working thread was ToolDefClassObject. LoadToolDefFile() Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 478.9320.190 10.9860.234 ToolDefClassObject.py:60(LoadToolDefFile) 40674781.3370.0001.3370.000 {method 'split' of 'str' objects} 285191.0640.0001.0640.000 {method 'execute' of 'sqlite3.Cursor' objects} 7333780.8260.0001.1610.000 StringIO.py:208(write) 710.6130.0090.6130.009 {method 'items' of 'dict' objects} 230.4990.022 11.9050.518 GuidSection.py:245(__FindExtendTool__) I see it gets called at the beginning of the build from InitBuild() and that makes sense. If I do a genmake build LoadToolDefFile() only gets called a single time. But it looks like __FindExtendTool__ is also making calls: ToolDefinition = ToolDefClassObject.ToolDefDict(GenFdsGlobalVariable.ConfDir).ToolsDefTxtDictionary So it looks to me like class GuidSection() is parsing the tools_def.txt for every GUID'ed section that has a GUID defined tool? How come this information is not cached? Does anyone see a quick fix? Thanks, Andrew Fish ___ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel