Re: [Python-Dev] To reduce Python "application" startup time
05.09.17 16:02, INADA Naoki wrote:
> While I can't attend the sprint, I saw the etherpad and found that
> Neil Schemenauer and Eric Snow will work on startup time.
>
> I want to share my current knowledge about startup time.
>
> For bare startup time (e.g. `python -c pass`), I'm waiting for the C
> implementation of ABC.
>
> But application startup time is more important, and we can improve
> it by optimizing imports of commonly used stdlib modules.
>
> The current `python -v` output is not useful for optimizing imports,
> so I use this patch to profile import time:
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb
>
> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
> and http.client:
> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>
> With this small patch:
>
> logging: 14.9ms -> 12.9ms
> asyncio: 62.1ms -> 58.2ms
> http.client: 43.8ms -> 36.1ms

See also https://bugs.python.org/issue30152, which optimizes the import time of argparse using a similar technique. I think these patches overlap.

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] To reduce Python "application" startup time
I should mention that I have a prototype design for making importlib's lazy loading easier to turn on and use. See https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb for my current notes. Part of it is an explicit lazy_import() function, which would remove the need to hide imports in functions to delay them.

On Wed, 6 Sep 2017 at 20:50 INADA Naoki wrote:

> > I’m not sure however whether burying imports inside functions (as a kind
> > of poor man’s lazy import) is ultimately going to be satisfying. First,
> > it’s not natural, it generally violates coding standards (e.g. PEP 8), and
> > can make linters complain.
>
> Of course, I tried to move an import only when (1) it's used by only one
> or two of the many functions in the module, (2) it's relatively heavy,
> and (3) it's rarely imported from other modules.
>
> > Second, I think you’ll end up chasing imports all over the stdlib and
> > third party modules in any sufficiently complicated application.
>
> Agreed. I won't spend much time on optimizing by moving imports from
> the top of a module into functions in the stdlib.
>
> I think my import-profiler patch can be polished and committed to Python
> to help library maintainers measure import time easily. (Maybe as
> `python -X import-profile`.)
>
> > Third, I’m not sure that the gains you’ll get won’t just be overwhelmed
> > by lots of other things going on, such as pkg_resources entry point
> > processing, pth file processing, site.py effects, command line processing
> > libraries such as click, and implicitly added distribution exception hooks
> > (e.g. Ubuntu’s apport).
>
> Yes. I noticed some of them while profiling imports.
> For example, an old-style namespace package imports the types module for
> types.ModuleType. The types module imports functools, and functools
> imports collections.
> So some efforts in CPython (avoiding importing collections and functools
> from site) are not worth much when at least one old-style namespace
> package is installed.
>
> > Many of these can’t be blamed on Python itself, but all can contribute
> > significantly to Python’s apparent start up time. It’s definitely worth
> > investigating the details of Python import, and a few of us at the core
> > sprint have looked at those numbers and thrown around ideas for
> > improvement, but we’ll need to look at the effects up and down the stack to
> > improve the start up performance for the average Python application.
>
> Yes. I totally agree with you. That's why I use import-profile.patch
> on some third-party libraries.
>
> Currently, I have these ideas for optimizing application startup time:
>
> * Faster, or lazy, compilation of regular expressions. (pkg_resources
>   imports pyparsing, which has a lot of regexes.)
> * More usable lazy import. (This could be solved by "PEP 549: Instance
>   Properties (aka: module properties)".)
> * Optimize enum creation.
> * Faster namedtuple. (There is a pull request already.)
> * Faster ABC.
> * Breaking up large import trees in the stdlib. (PEP 549 may help with
>   this too.)
>
> Regards,
>
> INADA Naoki
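For readers who want something concrete, an explicit lazy_import() could be sketched on top of the importlib.util.LazyLoader that already exists in the stdlib. This is only a minimal sketch following the recipe in the importlib documentation, not the prototype from the notebook above; the function name and its behaviour for already-imported modules are assumptions made for illustration.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose real import is deferred until first attribute access.

    Hypothetical helper, assumed for illustration only.
    """
    if name in sys.modules:
        # Already imported somewhere else: nothing left to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    # Wrap the real loader so that executing the module is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this call only arranges the deferral; the module
    # body runs the first time an attribute is looked up on the module.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")   # cheap: module body has not run yet
hls = colorsys.rgb_to_hls(1.0, 0.0, 0.0)  # first attribute access triggers the import
```

The import can stay at the top of the file, so it does not run into the PEP 8 objection raised against function-level imports, but any import error surfaces at first use rather than at the import line.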
Re: [Python-Dev] To reduce Python "application" startup time
INADA Naoki wrote:
> Current `python -v` is not useful to optimize import.
> So I use this patch to profile import time.
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb

I have implemented DTrace probes that do almost the same thing. Your patch is better in that it does not require an OS with DTrace or SystemTap. The DTrace probes are better in that they can be a part of the standard Python build.

https://github.com/nascheme/cpython/tree/dtrace-module-import

DTrace script:
https://gist.github.com/nascheme/c1cece36a3369926ee93cecc3d024179

Pretty printer for script output (very minimal):
https://gist.github.com/nascheme/0bff5c49bb6b518f5ce23a9aea27f14b
Re: [Python-Dev] To reduce Python "application" startup time
> I’m not sure however whether burying imports inside functions (as a kind of
> poor man’s lazy import) is ultimately going to be satisfying. First, it’s
> not natural, it generally violates coding standards (e.g. PEP 8), and can
> make linters complain.

Of course, I tried to move an import only when (1) it's used by only one or two of the many functions in the module, (2) it's relatively heavy, and (3) it's rarely imported from other modules.

> Second, I think you’ll end up chasing imports all over the stdlib and third
> party modules in any sufficiently complicated application.

Agreed. I won't spend much time on optimizing by moving imports from the top of a module into functions in the stdlib.

I think my import-profiler patch can be polished and committed to Python to help library maintainers measure import time easily. (Maybe as `python -X import-profile`.)

> Third, I’m not sure that the gains you’ll get won’t just be overwhelmed by
> lots of other things going on, such as pkg_resources entry point processing,
> pth file processing, site.py effects, command line processing libraries such
> as click, and implicitly added distribution exception hooks (e.g. Ubuntu’s
> apport).

Yes. I noticed some of them while profiling imports. For example, an old-style namespace package imports the types module for types.ModuleType. The types module imports functools, and functools imports collections. So some efforts in CPython (avoiding importing collections and functools from site) are not worth much when at least one old-style namespace package is installed.

> Many of these can’t be blamed on Python itself, but all can contribute
> significantly to Python’s apparent start up time. It’s definitely worth
> investigating the details of Python import, and a few of us at the core
> sprint have looked at those numbers and thrown around ideas for improvement,
> but we’ll need to look at the effects up and down the stack to improve the
> start up performance for the average Python application.

Yes. I totally agree with you. That's why I use import-profile.patch on some third-party libraries.

Currently, I have these ideas for optimizing application startup time:

* Faster, or lazy, compilation of regular expressions. (pkg_resources imports pyparsing, which has a lot of regexes.)
* More usable lazy import. (This could be solved by "PEP 549: Instance Properties (aka: module properties)".)
* Optimize enum creation.
* Faster namedtuple. (There is a pull request already.)
* Faster ABC.
* Breaking up large import trees in the stdlib. (PEP 549 may help with this too.)

Regards,

INADA Naoki
Re: [Python-Dev] To reduce Python "application" startup time
On Sep 6, 2017, at 00:42, INADA Naoki wrote:

> Additionally, faster startup time (and a smaller memory footprint) is good
> even for Web applications.
> For example, CGI is still a comfortable tool sometimes.
> Another example is GAE/Python.
>
> Anyway, I think researching the import trees of popular libraries is a good
> starting point for optimizing startup time.
> For example, modules like ast and tokenize are imported more often than I thought.

Improving start up time may indeed help long running processes but start up costs will generally be amortized across the lifetime of the process, so it isn’t as noticeable. However, startup time *is* a real issue for command line tools.

I’m not sure however whether burying imports inside functions (as a kind of poor man’s lazy import) is ultimately going to be satisfying. First, it’s not natural, it generally violates coding standards (e.g. PEP 8), and can make linters complain.

Second, I think you’ll end up chasing imports all over the stdlib and third party modules in any sufficiently complicated application.

Third, I’m not sure that the gains you’ll get won’t just be overwhelmed by lots of other things going on, such as pkg_resources entry point processing, pth file processing, site.py effects, command line processing libraries such as click, and implicitly added distribution exception hooks (e.g. Ubuntu’s apport).

Many of these can’t be blamed on Python itself, but all can contribute significantly to Python’s apparent start up time. It’s definitely worth investigating the details of Python import, and a few of us at the core sprint have looked at those numbers and thrown around ideas for improvement, but we’ll need to look at the effects up and down the stack to improve the start up performance for the average Python application.

Cheers,
-Barry
Re: [Python-Dev] To reduce Python "application" startup time
> Anyway, I think researching the import trees of popular libraries is a good
> starting point for optimizing startup time.

I agree -- in this case, you've identified that asyncio is expensive -- good to know.

In the jinja2 case, does it always need asyncio?

PEP 8 aside, I think it often makes sense for expensive optional imports to be done only if needed. Perhaps a patch to jinja2 is in order.

CHB

> For example, modules like ast and tokenize are imported more often than I
> thought.
>
> Jinja2 is one of the libraries I often use. I'm checking other libraries
> like requests.
>
> Thanks,
>
> INADA Naoki
Re: [Python-Dev] To reduce Python "application" startup time
On Wednesday, September 6, 2017, INADA Naoki wrote:

> > How significant is application startup time to something that uses
> > Jinja2? Are there short-lived programs that use it? Python startup
> > time matters enormously to command-line tools like Mercurial, but far
> > less to something that's designed to start up and then keep running
> > (eg a web app, which is where Jinja is most used).
>
> Since Jinja2 is a very popular template engine, it is used by CLI tools
> like ansible.

SaltStack uses Jinja2. It really is a good idea to regularly restart the minion processes.

Celery can also cycle through worker processes, IIRC.

> Additionally, faster startup time (and a smaller memory footprint) is good
> even for Web applications.
> For example, CGI is still a comfortable tool sometimes.
> Another example is GAE/Python.

Short-lived processes are sometimes preferable from a security standpoint. Python is currently less viable for CGI use than other scripting languages due to startup time.

Resource leaks (e.g. memory, file handles, database references; valgrind) do not last with short-lived CGI processes. If there's ASLR, that's also harder.

Scale-up operations with e.g. IaaS platforms like Kubernetes and PaaS platforms like AppScale all incur Python startup time on a regular basis.

> Anyway, I think researching the import trees of popular libraries is a good
> starting point for optimizing startup time.
> For example, modules like ast and tokenize are imported more often than I
> thought.
>
> Jinja2 is one of the libraries I often use. I'm checking other libraries
> like requests.
>
> Thanks,
>
> INADA Naoki
Re: [Python-Dev] To reduce Python "application" startup time
> How significant is application startup time to something that uses
> Jinja2? Are there short-lived programs that use it? Python startup
> time matters enormously to command-line tools like Mercurial, but far
> less to something that's designed to start up and then keep running
> (eg a web app, which is where Jinja is most used).

Since Jinja2 is a very popular template engine, it is used by CLI tools like ansible.

Additionally, faster startup time (and a smaller memory footprint) is good even for Web applications. For example, CGI is still a comfortable tool sometimes. Another example is GAE/Python.

Anyway, I think researching the import trees of popular libraries is a good starting point for optimizing startup time. For example, modules like ast and tokenize are imported more often than I thought.

Jinja2 is one of the libraries I often use. I'm checking other libraries like requests.

Thanks,

INADA Naoki
Re: [Python-Dev] To reduce Python "application" startup time
On Wed, Sep 6, 2017 at 2:30 PM, INADA Naoki wrote:
>>> This patch moves a few imports inside functions. I wonder whether that kind
>>> of change actually helps with real applications. Doesn't any real application
>>> end up importing the socket module anyway at some point?
>>
>> I don't know if this particular change is worthwhile, but one place
>> where startup slowness is particularly noticed is with commands like
>> 'foo.py --help' or 'foo.py --generate-completions' (the latter called
>> implicitly by hitting <TAB> in some shell), which typically do lots of
>> imports that end up not being used.
>
> Yes. And there are worse scenarios.
>
> 1. Jinja2 supports asyncio, so it imports asyncio.
> 2. asyncio imports concurrent.futures, for compatibility with its Future class.
> 3. The concurrent.futures package does
>    `from concurrent.futures.process import ProcessPoolExecutor`.
> 4. The concurrent.futures.process module imports multiprocessing.
>
> So when I use Jinja2 but not asyncio or multiprocessing, I still need to
> import a large dependency tree.
> I want to make the `import asyncio` dependency tree smaller.
>
> FYI, the current version of Jinja2 has a very large regex which takes more
> than 100ms at import time. It is fixed in the master branch, so if you want
> to look at Jinja2, please use the master branch.

How significant is application startup time to something that uses Jinja2? Are there short-lived programs that use it? Python startup time matters enormously to command-line tools like Mercurial, but far less to something that's designed to start up and then keep running (eg a web app, which is where Jinja is most used).

ChrisA
Re: [Python-Dev] To reduce Python "application" startup time
>> This patch moves a few imports inside functions. I wonder whether that kind
>> of change actually helps with real applications. Doesn't any real application
>> end up importing the socket module anyway at some point?
>
> I don't know if this particular change is worthwhile, but one place
> where startup slowness is particularly noticed is with commands like
> 'foo.py --help' or 'foo.py --generate-completions' (the latter called
> implicitly by hitting <TAB> in some shell), which typically do lots of
> imports that end up not being used.

Yes. And there are worse scenarios.

1. Jinja2 supports asyncio, so it imports asyncio.
2. asyncio imports concurrent.futures, for compatibility with its Future class.
3. The concurrent.futures package does `from concurrent.futures.process import ProcessPoolExecutor`.
4. The concurrent.futures.process module imports multiprocessing.

So when I use Jinja2 but not asyncio or multiprocessing, I still need to import a large dependency tree. I want to make the `import asyncio` dependency tree smaller.

FYI, the current version of Jinja2 has a very large regex which takes more than 100ms at import time. It is fixed in the master branch, so if you want to look at Jinja2, please use the master branch.

Regards,
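A chain like the one in steps 1 to 4 can be checked on any given interpreter by importing the module in a fresh process and listing what ends up in sys.modules. The following is a rough sketch, not the import-profile patch; the helper name modules_pulled_in is made up here, and the result depends on the Python version, so the chain described above may not reproduce on a newer release.

```python
import json
import subprocess
import sys

# Code run in the child interpreter: snapshot sys.modules, import the
# target, and print the names that were added by that import.
_PROBE = """
import json, sys
before = set(sys.modules)
import {module}
print(json.dumps(sorted(set(sys.modules) - before)))
"""


def modules_pulled_in(module):
    """Return names newly added to sys.modules by importing `module` in a fresh interpreter."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE.format(module=module)],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    loaded = modules_pulled_in("asyncio")
    print(len(loaded), "modules loaded by `import asyncio`")
    for name in ("concurrent.futures", "concurrent.futures.process", "multiprocessing"):
        print(f"{name}: {'loaded' if name in loaded else 'not loaded'}")
```

Modules that the probe itself needs (json and its dependencies) are imported before the snapshot and so are not counted.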
Re: [Python-Dev] To reduce Python "application" startup time
>> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
>> and http.client:
>>
>> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>>
> This patch moves a few imports inside functions. I wonder whether that kind
> of change actually helps with real applications. Doesn't any real application
> end up importing the socket module anyway at some point?

Ah, I'm sorry. That change doesn't make importing asyncio, logging and http.client faster.

I was looking at pkg_resources. While it's not stdlib, it is imported very often. It uses email.parser, but doesn't require socket or random. Since the socket module creates some enums, avoiding that import saves a few milliseconds.

Regards,
Re: [Python-Dev] To reduce Python "application" startup time
>> I haven't created a pull request yet.
>> (Can I create one without an issue, as a trivial patch?)
>
> Trivial, no-issue PRs are meant for things like typo fixes that need no
> discussion or record.
>
> Moving imports in violation of the PEP 8 rule, "Imports are always put at
> the top of the file, just after any module comments and docstrings, and
> before module globals and constants", is not trivial. Doing so voluntarily
> for speed, as opposed to doing so necessarily to avoid circular import
> errors, is controversial.
>
> --
> Terry Jan Reedy

Makes sense. I'll create an issue for each module if it seems really worthwhile.

Thanks,
Re: [Python-Dev] To reduce Python "application" startup time
On Tue, Sep 5, 2017 at 11:13 AM, Jelle Zijlstra wrote:
> 2017-09-05 6:02 GMT-07:00 INADA Naoki:
>> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
>> and http.client:
>>
>> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>>
> This patch moves a few imports inside functions. I wonder whether that kind
> of change actually helps with real applications. Doesn't any real application
> end up importing the socket module anyway at some point?

I don't know if this particular change is worthwhile, but one place where startup slowness is particularly noticed is with commands like 'foo.py --help' or 'foo.py --generate-completions' (the latter called implicitly by hitting <TAB> in some shell), which typically do lots of imports that end up not being used.

-n

--
Nathaniel J. Smith -- https://vorpus.org
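The `foo.py --help` case can be made concrete with a small command-line tool that defers a heavier import until the subcommand that needs it actually runs. This is a hypothetical sketch: the `status` subcommand and the function names are invented for the example, and http.client merely stands in for any expensive dependency.

```python
import argparse
import sys


def build_parser():
    # Building the parser needs only argparse, so `--help` stays cheap.
    parser = argparse.ArgumentParser(prog="foo.py", description="Hypothetical CLI.")
    sub = parser.add_subparsers(dest="command")
    status = sub.add_parser("status", help="print the reason phrase for an HTTP status code")
    status.add_argument("code", type=int)
    return parser


def cmd_status(code):
    # Deferred import: paid only when this subcommand runs.
    # Note that this is exactly the pattern PEP 8 discourages.
    import http.client

    return http.client.responses.get(code, "unknown")


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command == "status":
        print(cmd_status(args.code))
        return 0
    parser.print_help()
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

With this layout, `foo.py --help` exits inside parse_args() before http.client is ever imported, while `foo.py status 404` pays the import cost once.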
Re: [Python-Dev] To reduce Python "application" startup time
2017-09-05 6:02 GMT-07:00 INADA Naoki:

> Hi,
>
> While I can't attend the sprint, I saw the etherpad and found that
> Neil Schemenauer and Eric Snow will work on startup time.
>
> I want to share my current knowledge about startup time.
>
> For bare startup time (e.g. `python -c pass`), I'm waiting for the C
> implementation of ABC.
>
> But application startup time is more important, and we can improve
> it by optimizing imports of commonly used stdlib modules.
>
> The current `python -v` output is not useful for optimizing imports,
> so I use this patch to profile import time:
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb
>
> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
> and http.client:
>
> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch

This patch moves a few imports inside functions. I wonder whether that kind of change actually helps with real applications. Doesn't any real application end up importing the socket module anyway at some point?

> With this small patch:
>
> logging: 14.9ms -> 12.9ms
> asyncio: 62.1ms -> 58.2ms
> http.client: 43.8ms -> 36.1ms
>
> I haven't created a pull request yet.
> (Can I create one without an issue, as a trivial patch?)
>
> I'm very busy these days, maybe until December.
> But I hope this report helps people working on optimizing startup time.
>
> Regards,
>
> INADA Naoki
Re: [Python-Dev] To reduce Python "application" startup time
On 9/5/2017 9:02 AM, INADA Naoki wrote:

> But application startup time is more important, and we can improve
> it by optimizing imports of commonly used stdlib modules.
>
> The current `python -v` output is not useful for optimizing imports,
> so I use this patch to profile import time:
> https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb
>
> With this profile, I tried to optimize `python -c 'import asyncio'`, logging
> and http.client:
> https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch
>
> With this small patch:
>
> logging: 14.9ms -> 12.9ms
> asyncio: 62.1ms -> 58.2ms
> http.client: 43.8ms -> 36.1ms
>
> I haven't created a pull request yet.
> (Can I create one without an issue, as a trivial patch?)

Trivial, no-issue PRs are meant for things like typo fixes that need no discussion or record.

Moving imports in violation of the PEP 8 rule, "Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants", is not trivial. Doing so voluntarily for speed, as opposed to doing so necessarily to avoid circular import errors, is controversial.

--
Terry Jan Reedy
Re: [Python-Dev] To reduce Python "application" startup time
On 5 September 2017 at 15:02, INADA Naoki wrote:

> Hi,
>
> [...]
>
> For bare (e.g. `python -c pass`) startup time, I'm waiting C
> implementation of ABC.

Hi,

I am not sure I will be able to finish it this week. It also depends on first fixing the interactions with ABC caches in ``typing`` (as I mentioned on b.p.o., ``typing`` currently uses the private ABC API "aggressively").

--
Ivan
[Python-Dev] To reduce Python "application" startup time
Hi,

While I can't attend the sprint, I saw the etherpad and found that Neil Schemenauer and Eric Snow will work on startup time.

I want to share my current knowledge about startup time.

For bare startup time (e.g. `python -c pass`), I'm waiting for the C implementation of ABC.

But application startup time is more important, and we can improve it by optimizing imports of commonly used stdlib modules.

The current `python -v` output is not useful for optimizing imports, so I use this patch to profile import time:
https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb

With this profile, I tried to optimize `python -c 'import asyncio'`, logging and http.client:
https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch

With this small patch:

logging: 14.9ms -> 12.9ms
asyncio: 62.1ms -> 58.2ms
http.client: 43.8ms -> 36.1ms

I haven't created a pull request yet. (Can I create one without an issue, as a trivial patch?)

I'm very busy these days, maybe until December. But I hope this report helps people working on optimizing startup time.

Regards,

INADA Naoki
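Numbers like the ones above can be approximated without patching the interpreter by timing `python -c 'import X'` in a fresh process and subtracting the bare `python -c pass` time. This is a coarse sketch, not the import-profile patch from the gist: it reports only a total per module rather than a per-import breakdown, and the helper name startup_seconds is made up for the example.

```python
import subprocess
import sys
import time


def startup_seconds(code, repeat=5):
    """Best-of-`repeat` wall-clock time for `python -c code` in a fresh interpreter."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Bare interpreter startup, used as the baseline to subtract.
    baseline = startup_seconds("pass")
    for module in ("logging", "asyncio", "http.client"):
        total = startup_seconds(f"import {module}")
        print(f"{module}: {(total - baseline) * 1000:.1f} ms over bare startup")
```

Taking the minimum over several runs reduces noise from process creation and disk caching, but the figures still include subprocess overhead and will differ between machines and Python versions.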