Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Nick Coghlan
On 2 October 2017 at 15:13, Raymond Hettinger wrote:
>
>> On Oct 1, 2017, at 7:34 PM, Nathaniel Smith  wrote:
>>
>> In principle re.compile() itself could be made lazy -- return a
>> regular expression object that just holds the string, and then compiles
>> and caches it the first time it's used. Might be tricky to do in a
>> backwards-compatible way if it moves detection of invalid regexes
>> from compile time to use time, but it could be an opt-in flag.
>
> ISTM that someone writing ``re.compile(pattern)`` is explicitly saying they 
> want the regex to be pre-compiled.   For cache on first-use, we already have 
> a way to do that with ``re.search(pattern, some string)`` which compiles and 
> then caches.
>
> What would be more interesting would be to have a way to save the compiled 
> regex in a pyc file so that it can be restored on load rather than recomputed.
>
> Also, we should remind ourselves that making more and more things lazy is a 
> false optimization unless those things never get used.  Otherwise, all we're 
> doing is ending the timing before all the relevant work is done. If the lazy 
> object does get used, we've made the actual total execution time worse 
> (because of the overhead of the lazy evaluation logic).

Right, and I think the approach Inada-san took here is a good example
of how to do that effectively: there are a lot of command line scripts
and other startup-sensitive operations that will include an "import
requests", but *not* directly import any of the other modules in its
dependency tree, so "what requests uses" can identify a useful set of
avoidable imports. A Flask "Hello world" app could likely provide
another such sample, as could some example data analysis notebooks.
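A crude way to collect such a sample is to diff sys.modules around the
import; a rough sketch, assuming requests is installed:

import sys

before = set(sys.modules)
import requests
pulled_in = sorted(set(sys.modules) - before)
print(len(pulled_in), "modules pulled in by 'import requests'")
# Anything in pulled_in that the application never touches directly is
# a candidate for lazier handling somewhere in the dependency tree.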

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Paul Moore
On 2 October 2017 at 06:13, Raymond Hettinger wrote:
>
>> On Oct 1, 2017, at 7:34 PM, Nathaniel Smith  wrote:
>>
>> In principle re.compile() itself could be made lazy -- return a
>> regular expression object that just holds the string, and then compiles
>> and caches it the first time it's used. Might be tricky to do in a
>> backwards-compatible way if it moves detection of invalid regexes
>> from compile time to use time, but it could be an opt-in flag.
>
> ISTM that someone writing ``re.compile(pattern)`` is explicitly saying they 
> want the regex to be pre-compiled.   For cache on first-use, we already have 
> a way to do that with ``re.search(pattern, some string)`` which compiles and 
> then caches.

In practice, I don't think the fact that re.search() et al cache the
compiled expressions is that well known (it's mentioned in the
re.compile docs, but not in the re.search docs), and so people often
compile up front because they think it helps, rather than actually
measuring to check. Also, many regexes are long and complex, so
factoring them out as global variables is a reasonable practice. And
it's easy to imagine people deciding that making the global hold the
re.compile result, rather than a string that gets passed to re.search,
is a sensible thing to do (I know I'd do that, without even thinking
about it).

So I think that cache on first use is likely to be a useful
optimisation in practical terms. I don't have any feel for how many
uses of up-front re.compile would be harmed if we deferred compilation
to first use (other than "probably not many"), but we could make it
opt-in if necessary - we'd hit the same problem of people not thinking
to opt in, though.
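To illustrate, here's a minimal sketch of what an opt-in lazily
compiled pattern could look like (LazyPattern is hypothetical, just to
show the shape of the idea, not a proposed API):

import re

class LazyPattern:
    # Hold the string now, compile on first use.  Note the semantic
    # change: an invalid pattern only raises at first use.
    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None

    def __getattr__(self, name):
        # Only called for attributes not found normally (search, ...).
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return getattr(self._compiled, name)

DATE_RE = LazyPattern(r'\d{4}-\d{2}-\d{2}')   # no compile cost at import
DATE_RE.search('released 2017-10-02')         # compiled here, on first use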

Paul


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Christian Heimes
On 2017-10-02 04:04, INADA Naoki wrote:
> *3. ssl*
> 
> import time:      2007 |       2007 |                     ipaddress
> import time:      2386 |       2386 |                     textwrap
> import time:      2723 |       2723 |                     _ssl
> ...
> import time:       306 |        988 |                     base64
> import time:      2902 |      11004 |                   ssl
> 
> I already created a pull request to remove the textwrap dependency from ssl.
> https://github.com/python/cpython/pull/3849

Thanks for the patch. I left a comment on the PR. Please update your
patch and give me a chance to review patches next time.

> The ipaddress and _ssl modules are a bit slow too.  But I don't know
> whether we can improve them or not.

The _ssl extension module has to initialize OpenSSL, which is expected to
take a while. For 3.7 I'll replace ssl.match_hostname with an OpenSSL
function. The ssl module will then no longer depend on the re and ipaddress
modules.

> ssl itself took 2.9 ms.  It's because ssl has six enums.

Why are enums so slow?

Christian




Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Raymond Hettinger

> On Oct 2, 2017, at 12:39 AM, Nick Coghlan  wrote:
> 
>  "What requests uses" can identify a useful set of
> avoidable imports. A Flask "Hello world" app could likely provide
> another such sample, as could some example data analysis notebooks.

Right.  It is probably worthwhile to identify which parts of the library are 
typically imported but are not ever used.  And likewise, identify a core set of 
commonly used tools that are going to be almost unavoidable in sufficiently 
interesting applications (like using requests to access a REST API, running a 
micro-webframework, or invoking Mercurial).

Presumably, if any of this is going to make a difference to end users, we need 
to see if there is any avoidable work that takes a significant fraction of the 
total time from invocation through the point where the user first sees 
meaningful output.  That would include loading from nonvolatile storage, 
executing the various imports, and running the actual application logic.

I don't expect to find anything that would help users of Django, Flask, and 
Bottle since those are typically long-running apps where we value response time 
more than startup time.

For scripts using the requests module, there will be some fruit because not 
everything that is imported is used.  However, that may not be significant 
because scripts using requests tend to be I/O bound.  In the timings below, 6% 
of the running time is used to load and run python.exe, another 16% is used to 
import requests, and the remaining 78% is devoted to the actual task of running 
a simple REST API query. It would be interesting to see how much of the 16% 
could be avoided without major alterations to requests, to urllib3, and to the 
standard library.

For mercurial, "hg log" or "hg commit" will likely be instructive about what 
portion of the imports actually get used.  A push or pull will likely be I/O 
bound so those commands are less informative.


Raymond


--- Quick timing for a minimal script using the requests module ---

$ cat > demo_github_rest_api.py
import requests
info = requests.get('https://api.github.com/users/raymondh').json()
print('%(name)s works at %(company)s. Contact at %(email)s' % info)

$ time python3.6 demo_github_rest_api.py
Raymond Hettinger works at SauceLabs. Contact at None

real    0m0.561s
user    0m0.134s
sys     0m0.018s

$ time python3.6 -c "import requests"

real    0m0.125s
user    0m0.104s
sys     0m0.014s

$ time python3.6 -c ""

real    0m0.036s
user    0m0.024s
sys     0m0.005s




[Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Christian Heimes
Hello python-dev,

it's great to see that so many developers are working on speeding up
Python's startup. The improvements are going to make Python more
suitable for command line scripts. However I'm worried that some
approaches are going to make other use cases slower and less efficient.
I'm talking about downsides of lazy initialization and deferred imports.


For short running command line scripts, lazy initialization of regular
expressions and deferred import of rarely used modules can greatly
reduce startup time and reduce memory usage.


For long running processes, deferring imports and initialization can be
a huge performance problem. A typical server application should
initialize as much as possible at startup and then signal its partners
that it is ready to serve requests. A deferred import of a module is
going to slow down the first request that happens to require the module.
This is unacceptable for some applications, e.g. Raymond's example of
speed trading.

It's even worse for forking servers. A forking HTTP server handles each
request in a forked child. Each child process has to compile a lazy
regular expression or import a deferred module over and over.
uWSGI's emperor / vassal mode uses a pre-fork model with multiple server
processes to efficiently share memory with copy-on-write semantics. Lazy
imports will make the approach less efficient and slow down forking of
new vassals.
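To make the tradeoff concrete, here is a rough sketch (illustrative
only, with a hypothetical module list) of the warm-up step every such
service would then need to add:

import importlib
import re

EAGER_MODULES = ['re', 'ssl', 'uuid', 'json']   # hypothetical, app-specific

def warm_up():
    # Force deferred imports and lazy caches to be filled in the parent,
    # so forked children share the results via copy-on-write.
    for name in EAGER_MODULES:
        importlib.import_module(name)
    re.compile(r'^Host: (.*)$')   # populate re's compiled-pattern cache

warm_up()   # call before forking workers / signaling readiness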


TL;DR please refrain from moving imports into functions or implementing
lazy modes, until we have figured out how to satisfy requirements of
both scripts and long running services. We probably need a PEP...

Christian


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread INADA Naoki
Hi.

My company is using Python for web services,
so I understand what you're worried about.
I'm against fine-grained, massive lazy loading too.

But I think we're being careful enough about lazy importing.

https://github.com/python/cpython/pull/3849
In this PR, I stopped using textwrap entirely instead of importing it lazily.

https://github.com/python/cpython/pull/3796
In this PR, lazy loading only happens when uuid1 is used.
But uuid1 is very uncommon nowadays.

https://github.com/python/cpython/pull/3757
In this PR, singledispatch lazily imports types and weakref.
But singledispatch is used as a decorator,
so if a web application uses singledispatch, it's loaded before preforking.

https://github.com/python/cpython/pull/1269
In this PR, there are some lazy imports.
But the number of lazy imports seems small enough.

I don't think we're being too aggressive.

In the case of regular expressions, we're just starting the discussion.
No real changes have been made yet.

For example, tokenize.py has large regular expressions.
But most web applications use only one of them: linecache.py uses
tokenize.open(), which uses a regular expression to find the encoding cookie.
(Note that traceback uses linecache.  It's very commonly imported.)

So 90% of the time and memory spent importing tokenize is just a waste, not
only for CLI applications but also for web applications.
I have not created a PR to lazily import linecache or tokenize, because
I'm worried about importing them at the first traceback.

I feel Go's proverb helps in some cases: "A little copying is better than a
little dependency."
(https://go-proverbs.github.io/)
Maybe copying `tokenize.open()` into linecache is better than lazily loading
tokenize.


Anyway, I completely agree with you; we should be careful enough about lazy
importing and compiling.

Regards,

On Mon, Oct 2, 2017 at 6:47 PM Christian Heimes wrote:

> Hello python-dev,
>
> it's great to see that so many developers are working on speeding up
> Python's startup. The improvements are going to make Python more
> suitable for command line scripts. However I'm worried that some
> approaches are going to make other use cases slower and less efficient.
> I'm talking about downsides of lazy initialization and deferred imports.
>
>
> For short running command line scripts, lazy initialization of regular
> expressions and deferred import of rarely used modules can greatly
> reduce startup time and reduce memory usage.
>
>
> For long running processes, deferring imports and initialization can be
> a huge performance problem. A typical server application should
> initialize as much as possible at startup and then signal its partners
> that it is ready to serve requests. A deferred import of a module is
> going to slow down the first request that happens to require the module.
> This is unacceptable for some applications, e.g. Raymond's example of
> speed trading.
>
> It's even worse for forking servers. A forking HTTP server handles each
> request in a forked child. Each child process has to compile a lazy
> regular expression or import a deferred module over and over.
> uWSGI's emperor / vassal mode uses a pre-fork model with multiple server
> processes to efficiently share memory with copy-on-write semantics. Lazy
> imports will make the approach less efficient and slow down forking of
> new vassals.
>
>
> TL;DR please refrain from moving imports into functions or implementing
> lazy modes, until we have figured out how to satisfy requirements of
> both scripts and long running services. We probably need a PEP...
>
> Christian
-- 
Inada Naoki 


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Victor Stinner
2017-10-02 13:10 GMT+02:00 INADA Naoki :
> https://github.com/python/cpython/pull/3796
> In this PR, lazy loading only happens when uuid1 is used.
> But uuid1 is very uncommon nowadays.

Antoine Pitrou added a new C extension _uuid which is imported as soon
as uuid(.py) is imported. On Linux at least, the main "overhead" is
still paid on "import uuid". But Antoine's change optimized the
"import uuid" time a lot!

> https://github.com/python/cpython/pull/3757
> In this PR, singledispatch is lazy loading types and weakref.
> But singledispatch is used as decorator.
> So if web application uses singledispatch, it's loaded before preforking.

While "import module" is fast, maybe we should use sometimes a global
variable to cache the import.

module = None
def func():
   global module
   if module is None: import module
   ...

I'm not sure that it's possible to write a helper for such a pattern.
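That said, importlib.util.LazyLoader can already express something
close to it; a minimal sketch, along the lines of the recipe in the
importlib docs:

import importlib.util
import sys

def lazy_import(name):
    # Defer executing the module body until the first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import('json')   # module object exists, body not executed yet
json.dumps({})               # body executes here, on first use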

In *this case* it's OK, since @singledispatch is designed to be used
with top-level functions rather than nested functions, so the
overhead is only at startup, not at runtime in practice.

> Maybe, copying `tokenize.open()` into linecache is better than lazy loading
> tokenize.

Please don't copy code, only do that if we have no other choice.

> Anyway, I completely agree with you; we should careful enough about lazy
> (importing | compiling).

I think that most core devs are aware of tradeoffs and we try to find
a compromise on a case by case basis.

Victor


Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)

2017-10-02 Thread Koos Zevenhoven
On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum  wrote:

> On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven  wrote:
>
>> On Oct 1, 2017 19:26, "Guido van Rossum"  wrote:
>>
>> Your PEP is currently incomplete. If you don't finish it, it is not even
>> a contender. But TBH it's not my favorite anyway, so you could also just
>> withdraw it.
>>
>>
>> I can withdraw it if you ask me to, but I don't want to withdraw it
>> without any reason. I haven't changed my mind about the big picture. OTOH,
>> PEP 521 is elegant and could be used to implement PEP 555, but 521 is
>> almost certainly less performant and has some problems regarding context
>> manager wrappers that use composition instead of inheritance.
>>
>
> It is my understanding that PEP 521 (which proposes to add optional
> __suspend__ and __resume__ methods to the context manager protocol, to be
> called whenever a frame is suspended or resumed inside a `with` block) is
> no longer a contender because it would be way too slow. I haven't read it
> recently or thought about it, so I don't know what the second issue you
> mention is about (though it's presumably about the `yield` in a context
> manager implemented using a generator decorated with
> `@contextlib.contextmanager`).
>
>
Well, it's not completely unrelated to that. The problem I'm talking about
is perhaps most easily seen from a simple context manager wrapper that uses
composition instead of inheritance:

class Wrapper:
    def __init__(self):
        self._wrapped = SomeContextManager()

    def __enter__(self):
        print("Entering context")
        return self._wrapped.__enter__()

    def __exit__(self, *exc_info):
        result = self._wrapped.__exit__(*exc_info)
        print("Exited context")
        return result


Now, if the wrapped context manager becomes a PEP 521 one with __suspend__
and __resume__, the Wrapper class is broken, because it does not respect
__suspend__ and __resume__. So actually this is a backwards compatibility
issue.

But if the wrapper is made using inheritance, the problem goes away:


class Wrapper(SomeContextManager):
    def __enter__(self):
        print("Entering context")
        return super().__enter__()

    def __exit__(self, *exc_info):
        result = super().__exit__(*exc_info)
        print("Exited context")
        return result


Now the wrapper cleanly inherits the new optional __suspend__ and
__resume__ from the wrapped context manager type.


––Koos



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-Dev] PEP 553

2017-10-02 Thread Guido van Rossum
On Sun, Oct 1, 2017 at 11:15 PM, Terry Reedy  wrote:

> On 10/2/2017 12:44 AM, Guido van Rossum wrote:
>
> - There's no rationale for the *args, **kwds part of the breakpoint()
>> signature. (I vaguely recall someone on the mailing list asking for it but
>> it seemed far-fetched at best.)
>>
>
> If IDLE's event-driven GUI debugger were rewritten to run in the user
> process, people wanting to debug a tkinter program should be able to pass
> in their root, with its mainloop, rather than having the debugger create
> its own, as it normally would.  Something else could come up.
>

But if they care so much, they could also use a small wrapper as the
sys.breakpointhook that retrieves the root and calls the IDLE debugger with
that. Why is adding the root to the breakpoint() call better than that? To
me, the main attraction for breakpoint is that there's something I can type
quickly and insert at any point in the code. During a debugging session I
may try setting it in many different places. If I have to also pass the
root each time I type "breakpoint()" that's just an unnecessary detail
compared to having it done automatically by the hook.
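For example, something along these lines (a sketch only; this assumes
the PEP's sys.breakpointhook, and debug_in_idle is a made-up stand-in
for whatever entry point IDLE would actually expose):

import sys

def debug_in_idle(root, *args, **kwargs):
    # Hypothetical placeholder for a real IDLE debugger entry point.
    print('would start the IDLE debugger with root', root)

def idle_breakpointhook(*args, **kwargs):
    import tkinter
    root = tkinter._default_root          # reuse the app's existing Tk root
    return debug_in_idle(root, *args, **kwargs)

sys.breakpointhook = idle_breakpointhook
# From here on, a bare breakpoint() anywhere picks up the root automatically.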

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Christian Heimes
On 2017-10-02 14:05, George King wrote:
> I’m new to this issue, but curious: could the long-running server
> mitigate lazy loading problems simply by explicitly importing the
> deferred modules, e.g. at the top of __main__.py? It would require some
> performance tracing or other analysis to figure out what needed to be
> imported, but this might be a very easy way to win back response times
> for demanding applications. Conversely, small scripts currently have no
> recourse.

That approach could work, but I think that it is the wrong approach. I'd
rather keep Python optimized for long-running processes and introduce a
new mode / option to optimize for short-running scripts.

Christian


Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)

2017-10-02 Thread Victor Stinner
Please start a new thread on python-dev. It's unrelated to
"deterministic pyc files".

Victor


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread George King
I’m new to this issue, but curious: could the long-running server mitigate lazy 
loading problems simply by explicitly importing the deferred modules, e.g. at 
the top of __main__.py? It would require some performance tracing or other 
analysis to figure out what needed to be imported, but this might be a very 
easy way to win back response times for demanding applications. Conversely, 
small scripts currently have no recourse.


> On Oct 2, 2017, at 7:10 AM, INADA Naoki  wrote:
> 
> Hi.
> 
> My company is using Python for web services,
> so I understand what you're worried about.
> I'm against fine-grained, massive lazy loading too.
> 
> But I think we're being careful enough about lazy importing.
> 
> https://github.com/python/cpython/pull/3849
> In this PR, I stopped using textwrap entirely instead of importing it lazily.
> 
> https://github.com/python/cpython/pull/3796
> In this PR, lazy loading only happens when uuid1 is used.
> But uuid1 is very uncommon nowadays.
> 
> https://github.com/python/cpython/pull/3757
> In this PR, singledispatch lazily imports types and weakref.
> But singledispatch is used as a decorator,
> so if a web application uses singledispatch, it's loaded before preforking.
> 
> https://github.com/python/cpython/pull/1269
> In this PR, there are some lazy imports.
> But the number of lazy imports seems small enough.
> 
> I don't think we're being too aggressive.
> 
> In the case of regular expressions, we're just starting the discussion.
> No real changes have been made yet.
> 
> For example, tokenize.py has large regular expressions.
> But most web applications use only one of them: linecache.py uses
> tokenize.open(), which uses a regular expression to find the encoding cookie.
> (Note that traceback uses linecache.  It's very commonly imported.)
> 
> So 90% of the time and memory spent importing tokenize is just a waste, not
> only for CLI applications but also for web applications.
> I have not created a PR to lazily import linecache or tokenize, because
> I'm worried about importing them at the first traceback.
> 
> I feel Go's proverb helps in some cases: "A little copying is better than a
> little dependency."
> (https://go-proverbs.github.io/)
> Maybe copying `tokenize.open()` into linecache is better than lazily loading
> tokenize.
> 
> 
> Anyway, I completely agree with you; we should be careful enough about lazy
> importing and compiling.
> 
> Regards,
> 
> On Mon, Oct 2, 2017 at 6:47 PM Christian Heimes wrote:
> Hello python-dev,
> 
> it's great to see that so many developers are working on speeding up
> Python's startup. The improvements are going to make Python more
> suitable for command line scripts. However I'm worried that some
> approaches are going to make other use cases slower and less efficient.
> I'm talking about downsides of lazy initialization and deferred imports.
> 
> 
> For short running command line scripts, lazy initialization of regular
> expressions and deferred import of rarely used modules can greatly
> reduce startup time and reduce memory usage.
> 
> 
> For long running processes, deferring imports and initialization can be
> a huge performance problem. A typical server application should
> initialize as much as possible at startup and then signal its partners
> that it is ready to serve requests. A deferred import of a module is
> going to slow down the first request that happens to require the module.
> This is unacceptable for some applications, e.g. Raymond's example of
> speed trading.
> 
> It's even worse for forking servers. A forking HTTP server handles each
> request in a forked child. Each child process has to compile a lazy
> regular expression or import a deferred module over and over.
> uWSGI's emperor / vassal mode uses a pre-fork model with multiple server
> processes to efficiently share memory with copy-on-write semantics. Lazy
> imports will make the approach less efficient and slow down forking of
> new vassals.
> 
> 
> TL;DR please refrain from moving imports into functions or implementing
> lazy modes, until we have figured out how to satisfy requirements of
> both scripts and long running services. We probably need a PEP...
> 
> Christian
> -- 
> Inada Naoki

Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Barry Warsaw
On Oct 2, 2017, at 10:48, Christian Heimes  wrote:
> 
> That approach could work, but I think that it is the wrong approach. I'd
> rather keep Python optimized for long-running processes and introduce a
> new mode / option to optimize for short-running scripts.

What would that look like, how would it be invoked, and how would that change 
the behavior of the interpreter?

-Barry





Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Christian Heimes
On 2017-10-02 15:26, Victor Stinner wrote:
> 2017-10-02 13:10 GMT+02:00 INADA Naoki :
>> https://github.com/python/cpython/pull/3796
>> In this PR, lazy loading only happens when uuid1 is used.
>> But uuid1 is very uncommon nowadays.
> 
> Antoine Pitrou added a new C extension _uuid which is imported as soon
> as uuid(.py) is imported. On Linux at least, the main "overhead" is
> still paid on "import uuid". But Antoine's change optimized the
> "import uuid" time a lot!
> 
>> https://github.com/python/cpython/pull/3757
>> In this PR, singledispatch lazily imports types and weakref.
>> But singledispatch is used as a decorator,
>> so if a web application uses singledispatch, it's loaded before preforking.
> 
> While "import module" is fast, maybe we should use sometimes a global
> variable to cache the import.
> 
> module = None
> def func():
>global module
>if module is None: import module
>...
> 
> I'm not sure that it's possible to write an helper for such pattern.

I would rather like to see a function in importlib that handles deferred
imports:

modulename = importlib.deferred_import('modulename')

def deferred_import(name):
    if name in sys.modules:
        # special case 'None' here
        return sys.modules[name]
    else:
        return ModuleProxy(name)

ModuleProxy is a module type subclass that loads the module on demand.
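A minimal sketch of the proxy idea (illustrative only; a real version
would also need to handle reentrancy, blocked imports, submodules, and
replacing itself in sys.modules):

import importlib
import types

class ModuleProxy(types.ModuleType):
    # Trigger the real import on first attribute access.
    def __getattr__(self, attr):
        module = importlib.import_module(self.__name__)
        return getattr(module, attr)

textwrap = ModuleProxy('textwrap')          # cheap: nothing imported yet
print(textwrap.fill('a lazy import', 20))   # real import happens here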

Christian


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Victor Stinner
2017-10-02 16:48 GMT+02:00 Christian Heimes :
> That approach could work, but I think that it is the wrong approach. I'd
> rather keep Python optimized for long-running processes and introduce a
> new mode / option to optimize for short-running scripts.

"Filling caches on demand" is an old pattern. I don't think that we
are doing anything new here.

If we add an opt-in option, I would prefer to have an option to
explicitly "fill caches", rather than the opposite.

I know another example of "lazy cache": base64.b85encode() fills a
cache at the first call.
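For example:

import base64

base64.b85encode(b'warm-up')   # first call builds and caches the tables
base64.b85encode(b'payload')   # later calls reuse the cached tables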

Victor


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread George King
Fair, but can you justify your preference? From my perspective, I write many 
small command line scripts, and all of them would benefit from faster load 
times. Am I going to have to stick mode-setting incantations at the top of 
every single one? I occasionally write simple servers, and none of them would 
suffer from having the first request respond slightly slowly. In many cases 
they have slow first response times anyway due to file system warmup, etc.


> On Oct 2, 2017, at 10:48 AM, Christian Heimes  wrote:
> 
> On 2017-10-02 14:05, George King wrote:
>> I’m new to this issue, but curious: could the long-running server
>> mitigate lazy loading problems simply by explicitly importing the
>> deferred modules, e.g. at the top of __main__.py? It would require some
>> performance tracing or other analysis to figure out what needed to be
>> imported, but this might be a very easy way to win back response times
>> for demanding applications. Conversely, small scripts currently have no
>> recourse.
> 
> That approach could work, but I think that it is the wrong approach. I'd
> rather keep Python optimized for long-running processes and introduce a
> new mode / option to optimize for short-running scripts.
> 
> Christian



Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Barry Warsaw
On Oct 1, 2017, at 22:34, Nathaniel Smith  wrote:
> 
> In principle re.compile() itself could be made lazy -- return a
> regular expression object that just holds the string, and then compiles
> and caches it the first time it's used. Might be tricky to do in a
> backwards-compatible way if it moves detection of invalid regexes
> from compile time to use time, but it could be an opt-in flag.

I already tried that experiment.  1) there are tricky corner cases; 2) nobody 
liked the change in semantics when re.compile() was made lazy.

https://bugs.python.org/issue31580
https://github.com/python/cpython/pull/3755

I think there are opportunities for an explicit API for lazy compilation of 
regular expressions, but I’m skeptical of the adoption curve making it 
worthwhile.  But maybe I’m wrong!

-Barry





Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Serhiy Storchaka

On 02.10.17 16:26, Victor Stinner wrote:

While "import module" is fast, maybe we should use sometimes a global
variable to cache the import.

module = None
def func():
global module
if module is None: import module
...


I optimized "import module", and I think it can be optimized even more, 
up to making the above trick unnecessary. Currently there is an overhead 
of checking that the module found in sys.modules is not imported right now.




Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Gregory P. Smith
On Mon, Oct 2, 2017 at 8:03 AM Victor Stinner wrote:

> 2017-10-02 16:48 GMT+02:00 Christian Heimes :
> > That approach could work, but I think that it is the wrong approach. I'd
> > rather keep Python optimized for long-running processes and introduce a
> > new mode / option to optimize for short-running scripts.
>
> "Filling caches on demand" is an old pattern. I don't think that we
> are doing anything new here.
>
> If we add an opt-in option, I would prefer to have an option to
> explicitly "fill caches", rather than the opposite.
>

+1 the common case benefits from the laziness.

The much less common piece of code that needs to pre-initialize as much as
possible, to avoid work happening at an inopportune future time (prior to
forking, while handling latency-sensitive real-time requests yet still
being written in CPython, etc.), knows its needs and can ask for it.

-gps


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Christian Heimes
On 2017-10-02 16:59, Barry Warsaw wrote:
> On Oct 2, 2017, at 10:48, Christian Heimes  wrote:
>>
>> That approach could work, but I think that it is the wrong approach. I'd
>> rather keep Python optimized for long-running processes and introduce a
>> new mode / option to optimize for short-running scripts.
> 
> What would that look like, how would it be invoked, and how would that change 
> the behavior of the interpreter?

I haven't given it much thought yet. Here are just some wild ideas:

- add a '-l' command line option (l for lazy)
- in lazy mode, delay some slow operations (re.compile, enum, ...)
- delay some imports in lazy mode, e.g. with a deferred import proxy

Christian



[Python-Dev] Inheritance vs composition in backcompat (PEP521)

2017-10-02 Thread Koos Zevenhoven
Hi all, It was suggested that I start a new thread, because the other
thread drifted away from its original topic. So here, in case someone is
interested:

On Oct 2, 2017 17:03, "Koos Zevenhoven  wrote:

On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum  wrote:

On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven  wrote:

On Oct 1, 2017 19:26, "Guido van Rossum"  wrote:

Your PEP is currently incomplete. If you don't finish it, it is not even a
contender. But TBH it's not my favorite anyway, so you could also just
withdraw it.


I can withdraw it if you ask me to, but I don't want to withdraw it without
any reason. I haven't changed my mind about the big picture. OTOH, PEP 521
is elegant and could be used to implement PEP 555, but 521 is almost
certainly less performant and has some problems regarding context manager
wrappers that use composition instead of inheritance.


It is my understanding that PEP 521 (which proposes to add optional
__suspend__ and __resume__ methods to the context manager protocol, to be
called whenever a frame is suspended or resumed inside a `with` block) is
no longer a contender because it would be way too slow. I haven't read it
recently or thought about it, so I don't know what the second issue you
mention is about (though it's presumably about the `yield` in a context
manager implemented using a generator decorated with
`@contextlib.contextmanager`).


Well, it's not completely unrelated to that. The problem I'm talking about
is perhaps most easily seen from a simple context manager wrapper that uses
composition instead of inheritance:

class Wrapper:
    def __init__(self):
        self._wrapped = SomeContextManager()

    def __enter__(self):
        print("Entering context")
        return self._wrapped.__enter__()

    def __exit__(self, *exc_info):
        result = self._wrapped.__exit__(*exc_info)
        print("Exited context")
        return result


Now, if the wrapped context manager becomes a PEP 521 one with __suspend__
and __resume__, the Wrapper class is broken, because it does not respect
__suspend__ and __resume__. So actually this is a backwards compatibility
issue.

But if the wrapper is made using inheritance, the problem goes away:


class Wrapper(SomeContextManager):
    def __enter__(self):
        print("Entering context")
        return super().__enter__()

    def __exit__(self, *exc_info):
        result = super().__exit__(*exc_info)
        print("Exited context")
        return result


Now the wrapper cleanly inherits the new optional __suspend__ and
__resume__ from the wrapped context manager type.


––Koos




-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Brett Cannon
On Mon, 2 Oct 2017 at 08:00 Christian Heimes  wrote:

> On 2017-10-02 15:26, Victor Stinner wrote:
> > 2017-10-02 13:10 GMT+02:00 INADA Naoki :
> >> https://github.com/python/cpython/pull/3796
> >> In this PR, lazy loading only happens when uuid1 is used.
> >> But uuid1 is very uncommon nowadays.
> >
> > Antoine Pitrou added a new C extension _uuid which is imported as soon
> > as uuid(.py) is imported. On Linux at least, the main "overhead" is
> > still paid on "import uuid". But Antoine's change optimized the
> > "import uuid" time a lot!
> >
> >> https://github.com/python/cpython/pull/3757
> >> In this PR, singledispatch lazily imports types and weakref.
> >> But singledispatch is used as a decorator,
> >> so if a web application uses singledispatch, it's loaded before
> preforking.
> >
> > While "import module" is fast, maybe we should use sometimes a global
> > variable to cache the import.
> >
> > module = None
> > def func():
> >global module
> >if module is None: import module
> >...
> >
> > I'm not sure that it's possible to write an helper for such pattern.
>
> I would rather like to see a function in importlib that handles deferred
> imports:
>
> modulename = importlib.deferred_import('modulename')
>
> def deferred_import(name):
>     if name in sys.modules:
>         # special case 'None' here
>         return sys.modules[name]
>     else:
>         return ModuleProxy(name)
>
> ModuleProxy is a module type subclass that loads the module on demand.
>

My current design for an opt-in lazy importing setup includes an explicit
function for importlib that's mainly targeted at the stdlib and its
startup module needs, but could be used by others:
https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Terry Reedy

On 10/2/2017 4:57 AM, Paul Moore wrote:


In practice, I don't think the fact that re.search() et al cache the
compiled expressions is that well known (it's mentioned in the
re.compile docs, but not in the re.search docs)


We could add redundant mentions in the functions ;-).


and so people often compile up front because they think it helps,
rather than actually measuring to check.


--
Terry Jan Reedy



Re: [Python-Dev] PEP 553

2017-10-02 Thread Barry Warsaw
Thanks for the review Guido!  The PEP and implementation have been updated to 
address your comments, but let me briefly respond here.

> On Oct 2, 2017, at 00:44, Guido van Rossum  wrote:

> - There's a comma instead of a period at the end of the 4th bullet in the 
> Rationale: "Breaking the idiom up into two lines further complicates the use 
> of the debugger,” .

Thanks, fixed.

> Also I don't understand how this complicates use

I’ve addressed that with some additional wording in the PEP.  Basically, it’s 
my contention that splitting the idiom up into two lines introduces more 
opportunity for mistakes.

> TBH the biggest argument (to me) is that I simply don't know *how* I would 
> enter some IDE's debugger programmatically. I think it should also be pointed 
> out that even if an IDE has a way to specify conditional breakpoints, the UI 
> may be such that it's easier to just add the check to the code -- and then 
> the breakpoint() option is much more attractive than having to look up how 
> it's done in your particular IDE (especially since this is not all that 
> common).

This is a really excellent point!  I’ve reworked that section of the PEP to 
make this clear.

> - There's no rationale for the *args, **kwds part of the breakpoint() 
> signature. (I vaguely recall someone on the mailing list asking for it but it 
> seemed far-fetched at best.)

I’ve added some rationale.  The idea comes from the optional `header` argument 
to IPython’s programmatic debugger API.  I liked that enough to add it to 
pdb.set_trace() for 3.7.  IPython accepts other optional arguments, so I think 
we do want to allow those to be passed through the call chain.  I expect any 
debugger’s advertised entry point to make these optional, so `breakpoint()` 
will always just work.

> - The explanation of the relationship between sys.breakpoint() and 
> sys.__breakpointhook__ was unclear to me

I think you understand it correctly, and I’ve hopefully clarified that in the 
PEP now, so you wouldn’t have to read the __displayhook__ (or __excepthook__) 
docs to understand how it works.

> - Some pseudo-code would be nice.

Great idea; added that to the PEP (pretty close to what you have, but with the 
warnings handling, etc.)
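In rough form, the default hook does something like this (a simplified
sketch; the version in the PEP adds the warnings handling and error
cases, and the dotless/builtins case is elided here):

import importlib
import os

def breakpointhook(*args, **kwargs):
    hookname = os.environ.get('PYTHONBREAKPOINT')
    if hookname is None or hookname == '':
        hookname = 'pdb.set_trace'
    elif hookname == '0':
        return None                     # fast path: breakpoints disabled
    modname, _, funcname = hookname.rpartition('.')
    module = importlib.import_module(modname)
    hook = getattr(module, funcname)
    return hook(*args, **kwargs)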

> I think something like `os.environ['PYTHONBREAKPOINT'] = 'foo.bar.baz'; 
> breakpoint()` should result in foo.bar.baz() being imported and called, right?

Correct.  Clarified in the PEP now.

> - I'm not quite sure what sort of fast-tracking for PYTHONBREAKPOINT=0 you 
> had in mind beyond putting it first in the code above.

That’s pretty close to it.  Clarified.

> - Did you get confirmation from other debuggers? E.g. does it work for IDLE, 
> Wing IDE, PyCharm, and VS 2015?

From some of them, yes.  Terry confirmed for IDLE, and I posted a statement in 
favor of the PEP from the PyCharm folks.  I’m pretty sure Steve confirmed that 
this would be useful for VS, and I haven’t heard from the Wing folks.

> - I'm not sure what the point would be of making a call to breakpoint() a 
> special opcode (it seems a lot of work for something of this nature). ISTM 
> that if some IDE modifies bytecode it can do whatever it well please without 
> a PEP.

I’m strongly against including anything related to a new bytecode in PEP 553; 
they’re just IMHO orthogonal issues, and I’ve removed this as an open issue for 
553.

I understand why some debugger developers want it though.  There was a talk at 
Pycon 2017 about what PyCharm does.  They have to rewrite the bytecode to 
insert a call to a “trampoline function” which in many ways is the equivalent 
of breakpoint() and sys.breakpointhook().  I.e. it’s a small function that sets 
up and calls a more complicated function to do the actual debugging.  IIRC, 
they open up space for 4 bytecodes, with all the fixups that implies.  The idea 
was that there could be a single bytecode that essentially calls builtin 
breakpoint().  Steve indicated that this might also be useful for VS.

There’s a fair bit that would have to be fleshed out to make this idea real, 
but as I say, I think it shouldn’t have anything to do with PEP 553, except 
that it could probably build on the APIs we’re adding here.

> - I don't see the point of calling `pdb.pm()` at breakpoint time. But it 
> could be done using the PEP with `import pdb; sys.breakpointhook = pdb.pm` 
> right? So this hardly deserves an open issue.

Correct, and I’ve removed this open issue.

> - I haven't read the actual implementation in the PR. A PEP should not depend 
> on the actual proposed implementation for disambiguation of its specification 
> (hence my proposal to add pseudo-code to the PEP).
> 
> That's what I have!

Cool, that’s very helpful, thanks!

-Barry




Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Christian Heimes
On 2017-10-02 19:29, Brett Cannon wrote:
> My current design for an opt-in lazy importing setup includes an
> explicit function for importlib that's mainly targeted at the stdlib
> and its startup module needs, but could be used by others:
> https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb

Awesome, thanks Brett! :)

Small nit pick, you need to add a special case for blocked imports.



Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Brett Cannon
On Mon, 2 Oct 2017 at 02:43 Raymond Hettinger wrote:

>
> > On Oct 2, 2017, at 12:39 AM, Nick Coghlan  wrote:
> >
> >  "What requests uses" can identify a useful set of
> > avoidable imports. A Flask "Hello world" app could likely provide
> > another such sample, as could some example data analysis notebooks.
>
> Right.  It is probably worthwhile to identify which parts of the library
> are typically imported but are not ever used.  And likewise, identify a
> core set of commonly used tools that are going to be almost unavoidable in
> sufficiently interesting applications (like using requests to access a REST
> API, running a micro-webframework, or invoking Mercurial).
>
> Presumably, if any of this is going to make a difference to end users, we
> need to see if there is any avoidable work that takes a significant
> fraction of the total time from invocation through the point where the user
> first sees meaningful output.  That would include loading from nonvolatile
> storage, executing the various imports, and running the actual application logic.
>
> I don't expect to find anything that would help users of Django, Flask,
> and Bottle since those are typically long-running apps where we value
> response time more than startup time.
>
> For scripts using the requests module, there will be some fruit because
> not everything that is imported is used.  However, that may not be
> significant because scripts using requests tend to be I/O bound.  In the
> timings below, 6% of the running time is used to load and run python.exe,
> another 16% is used to import requests, and the remaining 78% is devoted to
> the actual task of running a simple REST API query. It would be interesting
> to see how much of the 16% could be avoided without major alterations to
> requests, to urllib3, and to the standard library.
>
> For mercurial, "hg log" or "hg commit" will likely be instructive about
> what portion of the imports actually get used.  A push or pull will likely
> be I/O bound so those commands are less informative.
>

So Mercurial specifically is an odd duck because they already do lazy
importing (in fact they are using the lazy loading support from importlib).
In terms of all of this discussion of tweaking import to be lazy, I think
the best approach would be providing an opt-in solution that CLI tools can
turn on ASAP while the default stays eager. That way everyone gets what
they want while the stdlib provides a shared solution that's maintained
alongside import itself to make sure it functions appropriately.

-Brett


>
>
> Raymond
>
>
> --- Quick timing for a minimal script using the requests module ---
>
> $ cat > demo_github_rest_api.py
> import requests
> info = requests.get('https://api.github.com/users/raymondh').json()
> print('%(name)s works at %(company)s. Contact at %(email)s' % info)
>
> $ time python3.6 demo_github_rest_api.py
> Raymond Hettinger works at SauceLabs. Contact at None
>
> real    0m0.561s
> user    0m0.134s
> sys     0m0.018s
>
> $ time python3.6 -c "import requests"
>
> real    0m0.125s
> user    0m0.104s
> sys     0m0.014s
>
> $ time python3.6 -c ""
>
> real    0m0.036s
> user    0m0.024s
> sys     0m0.005s


Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Brett Cannon
On Mon, 2 Oct 2017 at 11:19 Christian Heimes  wrote:

> On 2017-10-02 19:29, Brett Cannon wrote:
> > My current design for an opt-in lazy importing setup includes an
> > explicit function for importlib that's mainly targeted at the stdlib
> > and its startup module needs, but could be used by others:
> >
> https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
>
> Awesome, thanks Brett! :)
>

Well, I have to find the time to try and get this in for Python 3.7 (I'm
currently working with Barry on a pkg_resources replacement so there's a
queue :).


>
> Small nit pick, you need to add a special case for blocked imports.
>

Added a note at the end of the notebook about needing to make sure that's
properly supported.

-Brett


>


Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)

2017-10-02 Thread Brett Cannon
On Sat, 30 Sep 2017 at 18:46 Benjamin Peterson  wrote:

> What do you mean by bytecode-specific APIs? The internal importlib ones?
>

There's that, but more specifically py_compile and compileall.

-Brett


>
> On Fri, Sep 29, 2017, at 09:38, Brett Cannon wrote:
> > BTW, if you find the bytecode-specific APIs are sub-par while trying to
> > update them, let me know as I have been toying with cleaning them up and
> > centralizing them under importlib for a while and just never gotten
> > around
> > to sitting down and coming up with a better design that warranted putting
> > the time into it. :)
> >
> > On Fri, 29 Sep 2017 at 09:17 Benjamin Peterson 
> > wrote:
> >
> > > Thanks, Guido and everyone who submitted feedback!
> > >
> > > I guess I know what I'll be doing this weekend.
> > >
> > >
> > > On Fri, Sep 29, 2017, at 08:18, Guido van Rossum wrote:
> > > > Let me try that again.
> > > >
> > > > There have been no further comments. PEP 552 is now accepted.
> > > >
> > > > Congrats, Benjamin! Go ahead and send your implementation for
> > > > review. Oops. Let me try that again.
> > > >
> > > > PS. PEP 550 is still unaccepted, awaiting a new revision from Yury
> and
> > > > Elvis.
> > > >
> > > > --
> > > > --Guido van Rossum (python.org/~guido )


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Barry Warsaw
On Oct 2, 2017, at 14:56, Brett Cannon  wrote:

> So Mercurial specifically is an odd duck because they already do lazy 
> importing (in fact they are using the lazy loading support from importlib). 
> In terms of all of this discussion of tweaking import to be lazy, I think the 
> best approach would be providing an opt-in solution that CLI tools can turn 
> on ASAP while the default stays eager. That way everyone gets what they want 
> while the stdlib provides a shared solution that's maintained alongside 
> import itself to make sure it functions appropriately.

The problem, I think, is that to get the full benefit of lazy loading, it has 
to be turned on globally for bare ‘import’ statements.  A typical application 
has tons of dependencies, and all those libraries are also doing module-global 
imports, so unless lazy loading somehow covers them, it’ll be an incomplete 
gain.  But of course it’ll take forever for all your dependencies to use 
whatever new API we come up with, and if it’s not as convenient to write as 
‘import foo’ then I suspect it won’t much catch on anyway.

-Barry





Re: [Python-Dev] PEP 553

2017-10-02 Thread Terry Reedy

On 10/2/2017 10:45 AM, Guido van Rossum wrote:
On Sun, Oct 1, 2017 at 11:15 PM, Terry Reedy wrote:


On 10/2/2017 12:44 AM, Guido van Rossum wrote:

- There's no rationale for the *args, **kwds part of the
breakpoint() signature. (I vaguely recall someone on the mailing
list asking for it but it seemed far-fetched at best.)


If IDLE's event-driven GUI debugger were rewritten to run in the
user process, people wanting to debug a tkinter program should be
able to pass in their root, with its mainloop, rather than having
the debugger create its own, as it normally would.  Something else
could come up.


But if they care so much, they could also use a small wrapper as the 
sys.breakpointhook that retrieves the root and calls the IDLE debugger 
with that.


'They' include beginners who need the simplicity of breakpoint() the most.

Why is adding the root to the breakpoint() call better than 
that? To me, the main attraction for breakpoint is that there's 
something I can type quickly and insert at any point in the code.


I agree.


During a debugging session
I may try setting it in many different places. If I 
have to also pass the root each time I type "breakpoint()" that's just 
an unnecessary detail compared to having it done automatically by the hook.


Even though pdb.set_trace re-initializes on each call, idb.whatever should 
*not*. So it should set something that can be queried. My idea was that 
a person should pass root only on the first call.  But that founders on 
the fact that the 'first call' may not be deterministic:


if cond:
    breakpoint()

...

breakpoint()

Besides which, someone might insert breakpoint() before creating a root.

So I will try instead initializing with

    iroot = tk._default_root if tk._default_root else tk.Tk()

and stick with iroot.update() and avoid iroot.mainloop().

A revised tk-based debugger, closer to pdb than it is now, will require 
some experimentation.  I would like to be able to use it to debug IDLE 
run from a command line, and that will be a fairly severe test of 
compatibility with a tkinter application.


You could approve breakpoint() without args now and add them if and when 
there is a more convincing need.


--
Terry Jan Reedy




Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Antoine Pitrou
On Mon, 02 Oct 2017 18:56:15 +
Brett Cannon  wrote:
> 
> So Mercurial specifically is an odd duck because they already do lazy
> importing (in fact they are using the lazy loading support from importlib).

Do they?  I was under the impression they had their own home-baked,
GPL-licensed, lazy-loading __import__ re-implementation.

At least they used to, perhaps they switched to something else
(probably still GPL-licensed, though).

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Antoine Pitrou
On Mon, 2 Oct 2017 11:15:35 -0400
Barry Warsaw  wrote:
> 
> I think there are opportunities for an explicit API for lazy compilation of 
> regular expressions, but I’m skeptical of the adoption curve making it 
> worthwhile.  But maybe I’m wrong!

We already have two caching schemes available in the re module: one
explicit and eager with re.compile(), one implicit and lazy with
re.search() and friends.  I doubt we really need a third one :-)
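
For the record, the two existing schemes side by side (both are documented
behavior of the re module):

import re

# Explicit and eager: the pattern is parsed and compiled right here.
DATE = re.compile(r'\d{4}-\d{2}-\d{2}')

# Implicit and lazy: compiled on first use, then memoized in re's
# internal cache, so later calls with the same pattern are cheap.
re.search(r'\d{4}-\d{2}-\d{2}', '2017-10-02')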

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Inheritance vs composition in backcompat (PEP521)

2017-10-02 Thread Guido van Rossum
On Mon, Oct 2, 2017 at 10:13 AM, Koos Zevenhoven  wrote:

> Hi all, It was suggested that I start a new thread, because the other
> thread drifted away from its original topic. So here, in case someone is
> interested:
>
> On Oct 2, 2017 17:03, "Koos Zevenhoven" wrote:
>
> On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum  wrote:
>
> On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven  wrote:
>
> On Oct 1, 2017 19:26, "Guido van Rossum"  wrote:
>
> Your PEP is currently incomplete. If you don't finish it, it is not even a
> contender. But TBH it's not my favorite anyway, so you could also just
> withdraw it.
>
>
> I can withdraw it if you ask me to, but I don't want to withdraw it
> without any reason. I haven't changed my mind about the big picture. OTOH,
> PEP 521 is elegant and could be used to implement PEP 555, but 521 is
> almost certainly less performant and has some problems regarding context
> manager wrappers that use composition instead of inheritance.
>
>
> It is my understanding that PEP 521 (which proposes to add optional
> __suspend__ and __resume__ methods to the context manager protocol, to be
> called whenever a frame is suspended or resumed inside a `with` block) is
> no longer a contender because it would be way too slow. I haven't read it
> recently or thought about it, so I don't know what the second issue you
> mention is about (though it's presumably about the `yield` in a context
> manager implemented using a generator decorated with
> `@contextlib.contextmanager`).
>
>
> ​Well, it's not completely unrelated to that. The problem I'm talking
> about is perhaps most easily seen from a simple context manager wrapper
> that uses composition instead of inheritance:
>
> class Wrapper:
>     def __init__(self):
>         self._wrapped = SomeContextManager()
>
>     def __enter__(self):
>         print("Entering context")
>         return self._wrapped.__enter__()
>
>     def __exit__(self, *exc_info):
>         suppress = self._wrapped.__exit__(*exc_info)
>         print("Exited context")
>         return suppress
>
>
> Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__
> and __resume__, the Wrapper class is broken, because it does not respect
> __suspend__ and __resume__. So actually this is a backwards compatibility
> issue.
>
>
Why is it backwards incompatible? I'd think that without PEP 521 it would
be broken in exactly the same way because there's no __suspend__/__resume__
at all.


> But if the wrapper is made using inheritance, the problem goes away:
>
>
> class Wrapper(SomeContextManager):
>     def __enter__(self):
>         print("Entering context")
>         return super().__enter__()
>
>     def __exit__(self, *exc_info):
>         suppress = super().__exit__(*exc_info)
>         print("Exited context")
>         return suppress
>
>
> Now the wrapper cleanly inherits the new optional __suspend__ and
> __resume__ from the wrapped context manager type.
>
>
In any case this is completely academic because PEP 521 is not going to
happen. Nathaniel himself has said so (I think in the context of discussing
PEP 550).

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 553

2017-10-02 Thread Guido van Rossum
On Mon, Oct 2, 2017 at 11:02 AM, Barry Warsaw  wrote:

> Thanks for the review Guido!  The PEP and implementation have been updated
> to address your comments, but let me briefly respond here.
>
> > On Oct 2, 2017, at 00:44, Guido van Rossum  wrote:
>
> > - There's a comma instead of a period at the end of the 4th bullet in
> the Rationale: "Breaking the idiom up into two lines further complicates
> the use of the debugger,” .
>
> Thanks, fixed.
>
> > Also I don't understand how this complicates use
>
> I’ve addressed that with some additional wording in the PEP.  Basically,
> it’s my contention that splitting it up on two lines introduces more
> opportunity for mistakes.
>
> > TBH the biggest argument (to me) is that I simply don't know *how* I
> would enter some IDE's debugger programmatically. I think it should also be
> pointed out that even if an IDE has a way to specify conditional
> breakpoints, the UI may be such that it's easier to just add the check to
> the code -- and then the breakpoint() option is much more attractive than
> having to look up how it's done in your particular IDE (especially since
> this is not all that common).
>
> This is a really excellent point!  I’ve reworked that section of the PEP
> to make this clear.
>
> > - There's no rationale for the *args, **kwds part of the breakpoint()
> signature. (I vaguely recall someone on the mailing list asking for it but
> it seemed far-fetched at best.)
>
> I’ve added some rationale.  The idea comes from the optional `header` argument
> to IPython’s programmatic debugger API.  I liked that enough to add it to
> pdb.set_trace() for 3.7.  IPython accepts other optional arguments, so I
> think we do want to allow those to be passed through the call chain.  I
> expect any debugger’s advertised entry point to make these optional, so
> `breakpoint()` will always just work.
>
> > - The explanation of the relationship between sys.breakpoint() and
> sys.__breakpointhook__ was unclear to me
>
> I think you understand it correctly, and I’ve hopefully clarified that in
> the PEP now, so you wouldn’t have to read the __displayhook__ (or
> __excepthook__) docs to understand how it works.
>
> > - Some pseudo-code would be nice.
>
> Great idea; added that to the PEP (pretty close to what you have, but with
> the warnings handling, etc.)
>
> > I think something like `os.environ['PYTHONBREAKPOINT'] = 'foo.bar.baz';
> breakpoint()` should result in foo.bar.baz() being imported and called,
> right?
>
> Correct.  Clarified in the PEP now.
>
> > - I'm not quite sure what sort of fast-tracking for PYTHONBREAKPOINT=0
> you had in mind beyond putting it first in the code above.
>
> That’s pretty close to it.  Clarified.
>
> > - Did you get confirmation from other debuggers? E.g. does it work for
> IDLE, Wing IDE, PyCharm, and VS 2015?
>
> From some of them, yes.  Terry confirmed for IDLE, and I posted a
> statement in favor of the PEP from the PyCharm folks.  I’m pretty sure
> Steve confirmed that this would be useful for VS, and I haven’t heard from
> the Wing folks.
>
> > - I'm not sure what the point would be of making a call to breakpoint()
> a special opcode (it seems a lot of work for something of this nature).
> ISTM that if some IDE modifies bytecode it can do whatever it well please
> without a PEP.
>
> I’m strongly against including anything related to a new bytecode to PEP
> 553; they’re just IMHO orthogonal issues, and I’ve removed this as an open
> issue for 553.
>
> I understand why some debugger developers want it though.  There was a
> talk at Pycon 2017 about what PyCharm does.  They have to rewrite the
> bytecode to insert a call to a “trampoline function” which in many ways is
> the equivalent of breakpoint() and sys.breakpointhook().  I.e. it’s a small
> function that sets up and calls a more complicated function to do the
> actual debugging.  IIRC, they open up space for 4 bytecodes, with all the
> fixups that implies.  The idea was that there could be a single bytecode
> that essentially calls builtin breakpoint().  Steve indicated that this
> might also be useful for VS.
>
> There’s a fair bit that would have to be fleshed out to make this idea
> real, but as I say, I think it shouldn’t have anything to do with PEP 553,
> except that it could probably build on the APIs we’re adding here.
>
> > - I don't see the point of calling `pdb.pm()` at breakpoint time. But
> it could be done using the PEP with `import pdb; sys.breakpointhook =
> pdb.pm` right? So this hardly deserves an open issue.
>
> Correct, and I’ve removed this open issue.
>
> > - I haven't read the actual implementation in the PR. A PEP should not
> depend on the actual proposed implementation for disambiguation of its
> specification (hence my proposal to add pseudo-code to the PEP).
> >
> > That's what I have!
>
> Cool, that’s very helpful, thanks!
>

I've seen your updates and it is now acceptable, except for *one* nit: in
builtins.breakpoint() the pseudo code raises RuntimeError if
sys.breakpointhook is missing or None. OTOH sys.breakpointhook() just issues
a RuntimeWarning when something's wrong with the hook. Maybe
builtins.breakpoint() should also just warn if it can't find the hook?
Setting `sys.breakpointhook = None` might be the simplest way to
programmatically disable breakpoints. Why not allow it?
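
(Pulling the thread's description together, the PYTHONBREAKPOINT handling
amounts to roughly this sketch -- illustrative only, not the shipped
implementation; per the PEP, the default hook falls back to pdb.set_trace:)

import importlib
import os
import pdb
import warnings

def breakpointhook(*args, **kws):
    value = os.environ.get('PYTHONBREAKPOINT')
    if value == '0':
        return None                          # fast-tracked: breakpoints disabled
    if not value:
        return pdb.set_trace(*args, **kws)   # the documented default
    modname, _, funcname = value.rpartition('.')
    try:
        hook = getattr(importlib.import_module(modname), funcname)
    except Exception:
        warnings.warn('Ignoring unimportable $PYTHONBREAKPOINT: %r' % value,
                      RuntimeWarning)
        return None
    return hook(*args, **kws)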

Re: [Python-Dev] Inheritance vs composition in backcompat (PEP521)

2017-10-02 Thread Koos Zevenhoven
On Oct 3, 2017 00:02, "Guido van Rossum"  wrote:

On Mon, Oct 2, 2017 at 10:13 AM, Koos Zevenhoven  wrote:

> Hi all, It was suggested that I start a new thread, because the other
> thread drifted away from its original topic. So here, in case someone is
> interested:
>
> On Oct 2, 2017 17:03, "Koos Zevenhoven" wrote:
>
> On Mon, Oct 2, 2017 at 6:42 AM, Guido van Rossum  wrote:
>
> On Sun, Oct 1, 2017 at 1:52 PM, Koos Zevenhoven  wrote:
>
> On Oct 1, 2017 19:26, "Guido van Rossum"  wrote:
>
> Your PEP is currently incomplete. If you don't finish it, it is not even a
> contender. But TBH it's not my favorite anyway, so you could also just
> withdraw it.
>
>
> I can withdraw it if you ask me to, but I don't want to withdraw it
> without any reason. I haven't changed my mind about the big picture. OTOH,
> PEP 521 is elegant and could be used to implement PEP 555, but 521 is
> almost certainly less performant and has some problems regarding context
> manager wrappers that use composition instead of inheritance.
>
>
> It is my understanding that PEP 521 (which proposes to add optional
> __suspend__ and __resume__ methods to the context manager protocol, to be
> called whenever a frame is suspended or resumed inside a `with` block) is
> no longer a contender because it would be way too slow. I haven't read it
> recently or thought about it, so I don't know what the second issue you
> mention is about (though it's presumably about the `yield` in a context
> manager implemented using a generator decorated with
> `@contextlib.contextmanager`).
>
>
> ​Well, it's not completely unrelated to that. The problem I'm talking
> about is perhaps most easily seen from a simple context manager wrapper
> that uses composition instead of inheritance:
>
> class Wrapper:
>     def __init__(self):
>         self._wrapped = SomeContextManager()
>
>     def __enter__(self):
>         print("Entering context")
>         return self._wrapped.__enter__()
>
>     def __exit__(self, *exc_info):
>         suppress = self._wrapped.__exit__(*exc_info)
>         print("Exited context")
>         return suppress
>
>
> Now, if the wrapped contextmanager becomes a PEP 521 one with __suspend__
> and __resume__, the Wrapper class is broken, because it does not respect
> __suspend__ and __resume__. So actually this is a backwards compatibility
> issue.
>
>
Why is it backwards incompatible? I'd think that without PEP 521 it would
be broken in exactly the same way because there's no __suspend__/__resume__
at all.



The wrapper is (would be) broken because it depends on the internal
implementation of the wrapped CM.

Maybe the author of SomeContextManager wants to upgrade the CM to also work
in coroutines and generators. But it could be a more subtle change in the
CM implementation.

The problem becomes more serious and more obvious if you don't know which
context manager you are wrapping:

class Wrapper:
    def __init__(self, contextmanager):
        self._wrapped = contextmanager

    def __enter__(self):
        print("Entering context")
        return self._wrapped.__enter__()

    def __exit__(self, *exc_info):
        suppress = self._wrapped.__exit__(*exc_info)
        print("Exited context")
        return suppress


The wrapper is (would be) broken because it does not work for all CMs
anymore.
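
Note that plain attribute delegation doesn't rescue the composition version
either, because the interpreter looks protocol methods up on the *type*, not
the instance. So a sketch like this still wouldn't expose a wrapped CM's new
__suspend__/__resume__ to the `with` machinery:

class DelegatingWrapper:
    def __init__(self, contextmanager):
        self._wrapped = contextmanager

    def __enter__(self):
        return self._wrapped.__enter__()

    def __exit__(self, *exc_info):
        return self._wrapped.__exit__(*exc_info)

    def __getattr__(self, name):
        # Forwards explicit attribute access only; implicit special-method
        # lookup bypasses the instance, so this can't forward protocol
        # members the wrapper's class doesn't itself define.
        return getattr(self._wrapped, name)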


> But if the wrapper is made using inheritance, the problem goes away:
>
>
> class Wrapper(SomeContextManager):
>     def __enter__(self):
>         print("Entering context")
>         return super().__enter__()
>
>     def __exit__(self, *exc_info):
>         suppress = super().__exit__(*exc_info)
>         print("Exited context")
>         return suppress
>
>
> Now the wrapper cleanly inherits the new optional __suspend__ and
> __resume__ from the wrapped context manager type.
>
>
In any case this is completely academic because PEP 521 is not going to
happen. Nathaniel himself has said so (I think in the context of discussing
PEP 550).


I don't mind this (or Nathaniel ;-) being academic. The backwards
incompatibility issue I've just described applies to any extension via
composition, if the underlying type/protocol grows new members (like the CM
protocol would have gained __suspend__ and __resume__ in PEP521).


-- Koos (mobile)
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Inheritance vs composition in backcompat (PEP521)

2017-10-02 Thread Guido van Rossum
On Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote:

> I don't mind this (or Nathaniel ;-) being academic. The backwards
> incompatibility issue I've just described applies to any extension via
> composition, if the underlying type/protocol grows new members (like the CM
> protocol would have gained __suspend__ and __resume__ in PEP521).
>

Since you seem to have a good grasp on this issue, does PEP 550 suffer from
the same problem? (Or PEP 555, for that matter? :-)

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 553

2017-10-02 Thread Barry Warsaw
On Oct 2, 2017, at 17:36, Guido van Rossum  wrote:

> I've seen your updates and it is now acceptable, except for *one* nit: in 
> builtins.breakpoint() the pseudo code raises RuntimeError if 
> sys.breakpointhook is missing or None. OTOH sys.breakpointhook() just issues 
> a RuntimeWarning when something's wrong with the hook. Maybe 
> builtins.breakpoint() should also just warn if it can't find the hook? 
> Setting `sys.breakpointhook = None` might be the simplest way to 
> programmatically disable breakpoints. Why not allow it?

Oh, actually the pseudocode doesn’t match the C implementation exactly in this 
regard.  Currently the C implementation is more like:

def breakpoint(*args, **kws):
import sys
missing = object()
hook = getattr(sys, 'breakpointhook', missing)
if hook is missing:
raise RuntimeError('lost sys.breakpointhook')
return hook(*args, **kws)

The intent being, much like the other sys-hooks, that if 
PySys_GetObject("breakpointhook”) returns NULL, Something Bad Happened, so we 
have to set an error string and bail.  (PySys_GetObject() does not set an 
exception.)

E.g.

>>> def foo():
...   print('yes')
...   breakpoint()
...   print('no')
...
>>> del sys.breakpointhook
>>> foo()
yes
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 3, in foo
RuntimeError: lost sys.breakpointhook


Setting `sys.breakpointhook = None` could be an interesting use case, but that’s 
not currently special in any way:

>>> sys.breakpointhook = None
>>> foo()
yes
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 3, in foo
TypeError: 'NoneType' object is not callable


I’m open to special-casing this if you think it’s useful.

(I’ll update the pseudocode in the PEP.)

Cheers,
-Barry



___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Inheritance vs composition in backcompat (PEP521)

2017-10-02 Thread Koos Zevenhoven
On Oct 3, 2017 01:00, "Guido van Rossum"  wrote:

On Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote:

> I don't mind this (or Nathaniel ;-) being academic. The backwards
> incompatibility issue I've just described applies to any extension via
> composition, if the underlying type/protocol grows new members (like the CM
> protocol would have gained __suspend__ and __resume__ in PEP521).
>

Since you seem to have a good grasp on this issue, does PEP 550 suffer from
the same problem? (Or PEP 555, for that matter? :-)



Neither has this particular issue, because they don't extend an existing
protocol. If this thread has any significance, it will most likely be
elsewhere.

-- Koos (mobile)


-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Inheritance vs composition in backcompat (PEP521)

2017-10-02 Thread Koos Zevenhoven
On Oct 3, 2017 01:11, "Koos Zevenhoven"  wrote:

On Oct 3, 2017 01:00, "Guido van Rossum"  wrote:

On Mon, Oct 2, 2017 at 2:52 PM, Koos Zevenhoven wrote:

> I don't mind this (or Nathaniel ;-) being academic. The backwards
> incompatibility issue I've just described applies to any extension via
> composition, if the underlying type/protocol grows new members (like the CM
> protocol would have gained __suspend__ and __resume__ in PEP521).
>

Since you seem to have a good grasp on this issue, does PEP 550 suffer from
the same problem? (Or PEP 555, for that matter? :-)



Neither has this particular issue, because they don't extend an existing
protocol. If this thread has any significance, it will most likely be
elsewhere.


That said, I did come across this thought while trying to find flaws in my
own PEP ;)

-- Koos
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 553

2017-10-02 Thread Guido van Rossum
On Mon, Oct 2, 2017 at 3:03 PM, Barry Warsaw  wrote:

> On Oct 2, 2017, at 17:36, Guido van Rossum  wrote:
>
> > I've seen your updates and it is now acceptable, except for *one* nit:
> in builtins.breakpoint() the pseudo code raises RuntimeError if
> sys.breakpointhook is missing or None. OTOH sys.breakpointhook() just
> issues a RuntimeWarning when something's wrong with the hook. Maybe
> builtins.breakpoint() should also just warn if it can't find the hook?
> Setting `sys.breakpointhook = None` might be the simplest way to
> programmatically disable breakpoints. Why not allow it?
>
> Oh, actually the pseudocode doesn’t match the C implementation exactly in
> this regard.  Currently the C implementation is more like:
>
> def breakpoint(*args, **kws):
> import sys
> missing = object()
> hook = getattr(sys, 'breakpointhook', missing)
> if hook is missing:
> raise RuntimeError('lost sys.breakpointhook')
> return hook(*args, **kws)
>
> The intent being, much like the other sys-hooks, that if 
> PySys_GetObject("breakpointhook”)
> returns NULL, Something Bad Happened, so we have to set an error string and
> bail.  (PySys_GetObject() does not set an exception.)
>
> E.g.
>
> >>> def foo():
> ...   print('yes')
> ...   breakpoint()
> ...   print('no')
> ...
> >>> del sys.breakpointhook
> >>> foo()
> yes
> Traceback (most recent call last):
>   File "", line 1, in 
>   File "", line 3, in foo
> RuntimeError: lost sys.breakpointhook
>
>
> Setting `sys.breakpointhook = None` could be an interesting use case, but
> that’s not currently special in any way:
>
> >>> sys.breakpointhook = None
> >>> foo()
> yes
> Traceback (most recent call last):
>   File "", line 1, in 
>   File "", line 3, in foo
> TypeError: 'NoneType' object is not callable
>
>
> I’m open to special-casing this if you think it’s useful.
>
> (I’ll update the pseudocode in the PEP.)
>

OK. That then concludes the review of your PEP. It is now accepted!
Congrats. I am looking forward to using the backport. :-)

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 553

2017-10-02 Thread Barry Warsaw
On Oct 2, 2017, at 18:43, Guido van Rossum  wrote:
> 
> OK. That then concludes the review of your PEP. It is now accepted! Congrats. 
> I am looking forward to using the backport. :-)

Yay, thanks!  We’ll see if I can sneak that backport past Ned. :)

-Barry



___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Ronald Oussoren
On Oct 3, 2017, at 04:29, Barry Warsaw wrote:

> On Oct 2, 2017, at 14:56, Brett Cannon  wrote:
> 
>> So Mercurial specifically is an odd duck because they already do lazy 
>> importing (in fact they are using the lazy loading support from importlib). 
>> In terms of all of this discussion of tweaking import to be lazy, I think 
>> the best approach would be providing an opt-in solution that CLI tools can 
>> turn on ASAP while the default stays eager. That way everyone gets what they 
>> want while the stdlib provides a shared solution that's maintained alongside 
>> import itself to make sure it functions appropriately.
> 
> The problem I think is that to get full benefit of lazy loading, it has to be 
> turned on globally for bare ‘import’ statements.  A typical application has 
> tons of dependencies and all those libraries are also doing module global 
> imports, so unless lazy loading somehow covers them, it’ll be an incomplete 
> gain.  But of course it’ll take forever for all your dependencies to use 
> whatever new API we come up with, and if it’s not as convenient to write as 
> ‘import foo’ then I suspect it won’t much catch on anyway.
> 

One thing to keep in mind is that imports can have important side-effects. 
Turning every import statement into a lazy import will not be backward 
compatible. 
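
A contrived example of the kind of side effect I mean (nothing here comes
from a real library):

# plugin_registry.py
PLUGINS = []

def register(name):
    PLUGINS.append(name)

# Runs when the module body executes; under a globally lazy import this
# registration would silently be deferred until the first attribute access,
# or never happen at all if no attribute is ever touched.
register('default-handler')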

Ronald
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-10-02 Thread Eric Snow
On Thu, Sep 14, 2017 at 8:44 PM, Nick Coghlan  wrote:
> Not really, because the only way to ensure object separation (i.e no
> refcounted objects accessible from multiple interpreters at once) with
> a bytes-based API would be to either:
>
> 1. Always copy (eliminating most of the low overhead communications
> benefits that subinterpreters may offer over multiple processes)
> 2. Make the bytes implementation more complicated by allowing multiple
> bytes objects to share the same underlying storage while presenting as
> distinct objects in different interpreters
> 3. Make the output on the receiving side not actually a bytes object,
> but instead a view onto memory owned by another object in a different
> interpreter (a "memory view", one might say)

4. Pass Bytes through directly.

The only problem of which I'm aware is that when Py_DECREF() triggers
Bytes.__del__(), it happens in the current interpreter, which may not
be the "owner" (i.e. allocated the object).  So the solution would be
to make PyBytesType.tp_free() effectively run as a "pending call"
under the owner.  This would require two things:

1. a new PyBytesObject.owner field (PyInterpreterState *), or a
separate owner table, which would be set when the object is passed
through a channel
2. a Py_AddPendingCall() that targets a specific interpreter (which I
expect would be desirable regardless)

Then, when the object has an owner, PyBytesType.tp_free() would add a
pending call on the owner to call PyObject_Del() on the Bytes object.

The catch is that currently "pending" calls (via Py_AddPendingCall)
are run only in the main thread of the main interpreter.  We'd need a
similar mechanism that targets a specific interpreter.

> By contrast, if we allow an actual bytes object to be shared, then
> either every INCREF or DECREF on that bytes object becomes a
> synchronisation point, or else we end up needing some kind of
> secondary per-interpreter refcount where the interpreter doesn't drop
> its shared reference to the original object in its source interpreter
> until the internal refcount in the borrowing interpreter drops to
> zero.

There shouldn't be a need to synchronize on INCREF.  If both
interpreters have at least 1 reference then either one adding a
reference shouldn't be a problem.  If only one interpreter has a
reference then the other won't be adding any references.  If neither
has a reference then neither is going to add any references.  Perhaps
I've missed something.  Under what circumstances would INCREF happen
while the refcount is 0?

On DECREF there shouldn't be a problem except possibly with a small
race between decrementing the refcount and checking for a refcount of
0.  We could address that several different ways, including allowing
the pending call to get queued only once (or being a noop the second
time).

FWIW, I'm not opposed to the CIV/memoryview approach, but want to make
sure we really can't use Bytes before going down that route.

-eric
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-10-02 Thread Eric Snow
After having looked it over, I'm leaning toward supporting buffering,
as well as not blocking by default.  Neither adds much complexity to
the implementation.

On Sat, Sep 23, 2017 at 5:45 AM, Antoine Pitrou  wrote:
> On Fri, 22 Sep 2017 19:09:01 -0600
> Eric Snow  wrote:
>> > send() blocking until someone else calls recv() is not only bad for
>> > performance,
>>
>> What is the performance problem?
>
> Intuitively, there must be some kind of context switch (interpreter
> switch?) at each send() call to let the other end receive the data,
> since you don't have any internal buffering.

There would be an internal size-1 buffer.

>> (FWIW, CSP
>> provides rigorous guarantees about deadlock detection (which Go
>> leverages), though I'm not sure how much benefit that can offer such a
>> dynamic language as Python.)
>
> Hmm... deadlock detection is one thing, but when detected you must still
> solve those deadlock issues, right?

Yeah, I haven't given much thought to how we could leverage that
capability, but my gut feeling is that we won't have much opportunity to
do so. :)

>> I'm not sure I understand your concern here.  Perhaps I used the word
>> "sharing" too ambiguously?  By "sharing" I mean that the two actors
>> have read access to something that at least one of them can modify.
>> If they both only have read-only access then it's effectively the same
>> as if they are not sharing.
>
> Right.  What I mean is that you *can* share very simple "data" under
> the form of synchronization primitives.  You may want to synchronize
> your interpreters even they don't share user-visible memory areas.  The
> point of synchronization is not only to avoid memory corruption but
> also to regulate and orchestrate processing amongst multiple workers
> (for example processes or interpreters).  For example, a semaphore is
> an easy way to implement "I want no more than N workers to do this
> thing at the same time" ("this thing" can be something such as disk
> I/O).

I'm still not convinced that sharing synchronization primitives is
important enough to be worth including in the PEP.  It can be added
later, or via an extension module in the meantime.  To that end, I'll
add a mechanism to the PEP for third-party types to indicate that they
can be passed through channels.  Something like
"obj.__channel_support__ = True".

-eric
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-10-02 Thread Eric Snow
On Mon, Oct 2, 2017 at 9:31 PM, Eric Snow  wrote:
> On DECREF there shouldn't be a problem except possibly with a small
> race between decrementing the refcount and checking for a refcount of
> 0.  We could address that several different ways, including allowing
> the pending call to get queued only once (or being a noop the second
> time).

Alternately, the channel could own a reference and DECREF it in the
owning interpreter once the refcount reaches 1.

-eric
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-10-02 Thread Eric Snow
On Mon, Sep 25, 2017 at 8:42 PM, Nathaniel Smith  wrote:
> It's fairly reasonable to implement a mutex using a CSP-style
> unbuffered channel (send = acquire, receive = release). And the same
> trick turns a channel with a fixed-size buffer into a bounded
> semaphore. It won't be as efficient as a modern specialized mutex
> implementation, of course, but it's workable.
>
> Unfortunately while technically you can construct a buffered channel
> out of an unbuffered channel, the construction's pretty unreasonable
> (it needs two dedicated threads per channel).

Yeah, if threading's synchronization primitives make sense between
interpreters then we'll add direct support.  Using channels for that
isn't a good option.
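
For the record, the construction Nathaniel describes looks roughly like
this, with queue.Queue standing in for a channel with a fixed-size buffer:

import queue

class ChannelMutex:
    # send = acquire, receive = release; a capacity of N instead of 1
    # would give a bounded semaphore.
    def __init__(self):
        self._ch = queue.Queue(maxsize=1)

    def acquire(self):
        self._ch.put(None)   # blocks while another holder's token is queued

    def release(self):
        self._ch.get()       # removes the token, freeing the slot

lock = ChannelMutex()
lock.acquire()
# ... critical section ...
lock.release()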

-eric
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-10-02 Thread Eric Snow
On Wed, Sep 27, 2017 at 1:26 AM, Nick Coghlan  wrote:
> It's also the case that unlike Go channels, which were designed from
> scratch on the basis of implementing pure CSP,

FWIW, Go's channels (and goroutines) don't implement pure CSP.  They
provide a variant that the Go authors felt was more in-line with the
language's flavor.  The channels in the PEP aim to support a more pure
implementation.

> Python has an
> established behavioural precedent in the APIs of queue.Queue and
> collections.deque: they're unbounded by default, and you have to opt
> in to making them bounded.

Right.  That's part of why I'm leaning toward support for buffered channels.

> While the article title is clickbaity,
> http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
> actually has a good discussion of this point. Search for "compose" to
> find the relevant section ("Channels don’t compose well with other
> concurrency primitives").
>
> The specific problem cited is that only offering unbuffered or
> bounded-buffer channels means that every send call becomes a potential
> deadlock scenario, as all that needs to happen is for you to be
> holding a different synchronisation primitive when the send call
> blocks.

Yeah, that blog post was a reference for me as I was designing the
PEP's channels.

> The fact that the proposal now allows for M:N sender:receiver
> relationships (just as queue.Queue does with threads) makes that
> problem worse, since you may now have variability not only on the
> message consumption side, but also on the message production side.
>
> Consider this example where you have an event processing thread pool
> that we're attempting to isolate from blocking IO by using channels
> rather than coroutines.
>
> Desired flow:
>
> 1. Listener thread receives external message from socket
> 2. Listener thread files message for processing on receive channel
> 3. Listener thread returns to blocking on the receive socket
>
> 4. Processing thread picks up message from receive channel
> 5. Processing thread processes message
> 6. Processing thread puts reply on the send channel
>
> 7. Sending thread picks up message from send channel
> 8. Sending thread makes a blocking network send call to transmit the message
> 9. Sending thread returns to blocking on the send channel
>
> When queue.Queue is used to pass the messages between threads, such an
> arrangement will be effectively non-blocking as long as the send rate
> is greater than or equal to the receive rate. However, the GIL means
> it won't exploit all available cores, even if we create multiple
> processing threads: you have to switch to multiprocessing for that,
> with all the extra overhead that entails.
>
> So I see the essential premise of PEP 554 as being to ask the question
> "If each of these threads was running its own *interpreter*, could we
> use Sans IO style protocols with interpreter channels to separate
> internally "synchronous" processing threads from separate IO threads
> operating at system boundaries, without having to make the entire
> application pervasively asynchronous?"

+1

> If channels are an unbuffered blocking primitive, then we don't get
> that benefit: even when there are additional receive messages to be
> processed, the processing thread will block until the previous send
> has completed. Switching the listener and sender threads over to
> asynchronous IO would help with that, but they'd also end up having to
> implement their own message buffering to manage the lack of buffering
> in the core channel primitive.
>
> By contrast, if the core channels are designed to offer an unbounded
> buffer by default, then you can get close-to-CSP semantics just by
> setting the buffer size to 1 (it's still not exactly CSP, since that
> has a buffer size of 0, but you at least get the semantics of having
> to alternate sending and receiving of messages).

Yep, I came to the same conclusion.
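
Concretely, with queue.Queue as the stand-in again: maxsize=1 forces the two
sides to nearly alternate, while the default unbounded queue never blocks
the sender:

import queue
import threading

ch = queue.Queue(maxsize=1)  # close-to-CSP: a second put() waits for a get()

def producer():
    for i in range(3):
        ch.put(i)            # blocks until the previous item is consumed

threading.Thread(target=producer).start()
for _ in range(3):
    print(ch.get())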

>> By the way, I do think efficiency is a concern here.  Otherwise
>> subinterpreters don't even have a point (just use multiprocessing).
>
> Agreed, and I think the interaction between the threading module and
> the interpreters module is one we're going to have to explicitly call
> out as being covered by the provisional status of the interpreters
> module, as I think it could be incredibly valuable to be able to send
> at least some threading objects through channels, and have them be an
> interpreter-specific reference to a common underlying sync primitive.

Agreed.  I'll add a note to the PEP.

-eric
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Make re.compile faster

2017-10-02 Thread INADA Naoki
Before deferring re.compile, can we make it faster?

I profiled `import string` and a small optimization can make it 2x faster!
(but it's not backward compatible)

Before optimize:

import time: self [us] | cumulative | imported package
import time:  2339 |   9623 | string

string module took about 2.3 ms to import.
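
(If you want to reproduce this: the table comes from CPython 3.7's new
-X importtime option, e.g. `python -X importtime -c 'import string'`,
which prints those "import time: self [us] | cumulative" lines to stderr.)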

I found:

* RegexFlag.__and__ and __new__ are called very often.
* _optimize_charset is slow, because re.UNICODE | re.IGNORECASE

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
index 144620c6d1..7c662247d4 100644
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -582,7 +582,7 @@ def isstring(obj):

 def _code(p, flags):

-    flags = p.pattern.flags | flags
+    flags = int(p.pattern.flags) | int(flags)
     code = []
 
     # compile info block
diff --git a/Lib/string.py b/Lib/string.py
index b46e60c38f..fedd92246d 100644
--- a/Lib/string.py
+++ b/Lib/string.py
@@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
     delimiter = '$'
     idpattern = r'[_a-z][_a-z0-9]*'
     braceidpattern = None
-    flags = _re.IGNORECASE
+    flags = _re.IGNORECASE | _re.ASCII
 
     def __init__(self, template):
         self.template = template

patched:
import time:  1191 |   8479 | string

Of course, this patch is not backward compatible. [a-z] doesn't match with
'ı' or 'ſ' anymore.
But who cares?

(in sre_compile.py)
# LATIN SMALL LETTER I, LATIN SMALL LETTER DOTLESS I
(0x69, 0x131), # iı
# LATIN SMALL LETTER S, LATIN SMALL LETTER LONG S
(0x73, 0x17f), # sſ
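
The trade-off in action (a quick check, assuming the current Unicode-aware
folding):

import re

print(re.fullmatch('[a-z]', 'ſ', re.IGNORECASE))             # matches today
print(re.fullmatch('[a-z]', 'ſ', re.IGNORECASE | re.ASCII))  # None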

There are some other `re.I(GNORECASE)` options in stdlib. I'll check them.

More optimization can be done by implementing sre_parse and sre_compile
in C.
But I have no time for it this year.

Regards,
-- 
Inada Naoki 
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Make re.compile faster

2017-10-02 Thread Serhiy Storchaka

03.10.17 06:29, INADA Naoki wrote:

Before deferring re.compile, can we make it faster?

I profiled `import string` and a small optimization can make it 2x faster!
(but it's not backward compatible)


Please open an issue for this.


I found:

* RegexFlag.__and__ and __new__ are called very often.
* _optimize_charset is slow, because re.UNICODE | re.IGNORECASE

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
index 144620c6d1..7c662247d4 100644
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -582,7 +582,7 @@ def isstring(obj):

  def _code(p, flags):

-    flags = p.pattern.flags | flags
+    flags = int(p.pattern.flags) | int(flags)
      code = []

      # compile info block


Maybe cast flags to int earlier, in sre_compile.compile()?


diff --git a/Lib/string.py b/Lib/string.py
index b46e60c38f..fedd92246d 100644
--- a/Lib/string.py
+++ b/Lib/string.py
@@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
      delimiter = '$'
      idpattern = r'[_a-z][_a-z0-9]*'
      braceidpattern = None
-    flags = _re.IGNORECASE
+    flags = _re.IGNORECASE | _re.ASCII

      def __init__(self, template):
          self.template = template

patched:
import time:      1191 |       8479 | string

Of course, this patch is not backward compatible. [a-z] doesn't match 
with 'ı' or 'ſ' anymore.

But who cares?


This looks like a bug fix. I'm wondering if it is worth backporting it 
to 3.6. But the change itself can break user code that changes 
idpattern without touching flags. There is another way, but it should be 
discussed on the bug tracker.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Make re.compile faster

2017-10-02 Thread Serhiy Storchaka

03.10.17 06:29, INADA Naoki wrote:
More optimization can be done by implementing sre_parse and 
sre_compile in C.

But I have no time for it this year.


And please don't do this! This would make maintaining the re module 
hard. The performance of the compiler is less important than correctness 
and performance of matching and searching.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com