Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-24 Thread Franklin? Lee
On Fri, Sep 14, 2018 at 6:08 PM Larry Hastings  wrote:
> I can suggest that, based on conversation from Carl, that adding the stat 
> calls back in costs you half the startup.  So any mechanism where we're 
> talking to the disk _at all_ simply isn't going to be as fast.

Is that cost for when the stat calls are done in parallel with the new
loading mechanism?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Carl Shapiro
On Thu, Sep 20, 2018 at 11:20 PM, Stefan Behnel  wrote:

> What about the small integers cache? The C serialisation generates several
> PyLong objects that would normally reside in the cache. Is this handled
> somewhere? I guess the cache could entirely be loaded from the data
> segment. And the same would have to be done for interned strings. Basically
> anything that CPython only wants to have one instance of.
>

Un-marshaled immutable objects are tracked in a table to ensure their
uniqueness.  Thanks for mentioning the small integer cache.  It is not part
of the change, but it could be brought under this framework.  By doing so,
we could store the small integer objects instances in the data segment and
other data segment objects could reference those unique small integer
instances.
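
As a rough illustration of the uniqueness table mentioned above, here is a minimal Python sketch. The patch itself does this in C; the class name UniquingTable and its method are hypothetical and are not taken from the patch.

```python
class UniquingTable:
    """Hypothetical helper: map equal immutable values to one canonical object."""

    def __init__(self):
        self._canonical = {}

    def unique(self, obj):
        # Key on the type as well as the value, because 1, 1.0 and True
        # compare (and hash) equal but must remain distinct objects.
        # Limitation of this sketch: it does not look inside containers,
        # so (1,) and (1.0,) would still share one entry.
        key = (type(obj), obj)
        # setdefault stores obj on the first call and returns the stored
        # object on every later call with an equal key.
        return self._canonical.setdefault(key, obj)


table = UniquingTable()
first = table.unique(tuple(["spam", 1]))
second = table.unique(tuple(["spam", 1]))   # equal value, built separately
print(first is second)
```

Bringing the small integer cache "under this framework" would amount to seeding such a table with the canonical small-int objects before anything else is registered.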

> That would severely limit the application of this optimisation to external
> modules, though. I don't see a way how they could load their data
> structures from the data segment without duplicating all sorts of
> "singletons".


Yes, additional load-time work would have to be done to ensure the
uniqueness of those objects.


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Stefan Behnel
Guido van Rossum wrote on 21.09.2018 at 19:35:
> Though now I start worrying about interned strings. That's a concept that's
> a little closer to being a feature.

True. While there's the general '"ab"+"cd" is (not) "abcd"' caveat, I'm
sure quite a bit of code out there assumes that parsed identifiers in a
module, such as the names of functions and classes, are interned, since
this was often communicated. And in fact, explicitly interning the same
name might return a different string object with this change than what's in
the module/class dict.
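
A small sketch of the interning behaviour under discussion, using only sys.intern() from the standard library. It shows the guarantee that code can rely on (two interned equal strings are one object); it does not reproduce the patch's behaviour.

```python
import sys

# Build the string at run time so it need not be the same object as the literal.
built = "".join(["ab", "cd"])
literal = "abcd"

# Equal by value in every case.
print(built == literal)

# sys.intern() returns the canonical object for a given value, so two
# interned strings with equal contents are always the same object.
print(sys.intern(built) is sys.intern(literal))
```

The concern raised here is the reverse case: a name stored in a module or class dict that was never passed through the intern table, so that an explicit sys.intern() of the same text yields a different object.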

Stefan



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Carl Shapiro
On Tue, Sep 18, 2018 at 3:00 PM, Neil Schemenauer 
wrote:

> The users of Python are pretty diverse so it depends on who you ask.
> Some would like a giant executable that includes everything they
> need (sort of like the Go approach).  Other people want an executable
> that has just importlib inside it and then mix-and-match different
> shared libs for their different purposes.  Some will want to work
> "old school" and load from separate .py or .pyc files.
>
> I see no reason why we can't support all these options.
>

Supporting those options is possible if some of our simplifying
assumptions are revisited.  Here are a few:

We know about all the objects being stored in the data segment.  That makes
it easy to ensure that immutable objects are unique.  If we knew anything
less, that work would have to be done at load time.

We do not have to worry about the relocation cost of the pointers we add to
the data segment.  We are compiled into an executable that typically gets
loaded at a fixed address.  This could become a performance concern if we
wrote our data into a shared library.

Because we are compiled into the runtime, we do not have versioning
issues.  There is no possibility of PyObject_HEAD or any PyObject subclass
being changed out from under us.  The existing marshal format abstracts
away from these details but our format is very sensitive to the host
environment.

All of these problems have technical solutions.  They should be evaluated
carefully to ensure that the added overhead does not wipe out the
performance wins or add lots of complexity to the runtime.


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Guido van Rossum
On Fri, Sep 21, 2018 at 8:16 AM Christian Heimes 
wrote:

> On 21/09/2018 16.26, Guido van Rossum wrote:
> >> What about the small integers cache?
> >
> > I believe the small integers cache is only used to reduce the number of
> > objects -- I don't think there's any code (in CPython itself) that just
> > *assumes* that because an int is small it must be in the cache. So it
> > should be fine.
>
> Some places may assume that PyLong_FromLong() for a small int never
> fails. I certainly expect this in coverity scan modeling.
>

Ah, that goes in the other direction. That function will always return a
value from the cache if it's in range for the cache, and nothing changes
there.

I was talking about situations where code might assume that if an object's
address is not that of the canonical cached zero-valued PyLong object, it
couldn't be a PyLong with value zero (same for other values in range of the
cache). I'd be very surprised if there was code assuming that, and I'd say
it was always wrong. (It's like beginners' code using 'x is 0' instead of
'x == 0'.)
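
To make the distinction concrete, here is a short sketch contrasting the two checks. Both function names are made up for illustration. Whether the identity check happens to succeed depends on the implementation's cache, which is exactly why code should not rely on it.

```python
def is_zero_by_value(x):
    # Robust: compares the numeric value, independent of object identity.
    return x == 0


def is_zero_by_identity(x, canonical_zero=0):
    # Fragile: true only when x is the very same object as canonical_zero.
    return x is canonical_zero


zero = int("0")                     # a zero computed at run time
print(is_zero_by_value(zero))       # value comparison works for any zero
print(is_zero_by_value(0.0))        # including zeros of other numeric types
print(is_zero_by_identity(zero))    # result depends on caching; do not rely on it
```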

Though now I start worrying about interned strings. That's a concept that's
a little closer to being a feature.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Jeethu Rao
> On Sep 21, 2018, at 06:53, Stefan Behnel  wrote:
> 
> Totally. This might actually be more relevant for Cython than for CPython
> in the end, because it wouldn't be limited to the stdlib and its core modules.
> 
> It's a bit more difficult for us, because this probably won't work easily
> across Python releases (2.[67] and 3.[45678] for now) and also definitely
> not for PyPy, but that just means some multiplication of the generated
> code, and we have the dynamic part of it already. Supporting that for
> Unicode strings will be fun, I'm sure. :)

I’m glad to hear that this might be relevant to Cython.
I believe it should be straightforward to parametrize the code generator to
generate code targeted at specific CPython versions. While we originally
targeted 3.6, Larry Hastings managed to quickly port it to 3.8.
The two changes in CPython’s data structures between 3.6 and 3.8 that needed
changes to the code-gen were [1] from 3.7 and [2] from 3.8. And internally,
I’ve still got a task open to back-port this to support 2.7.

-- Jeethu

[1]: https://bugs.python.org/issue18896
[2]: https://bugs.python.org/issue33597


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Christian Heimes
On 21/09/2018 16.26, Guido van Rossum wrote:
>> What about the small integers cache?
> 
> I believe the small integers cache is only used to reduce the number of
> objects -- I don't think there's any code (in CPython itself) that just
> *assumes* that because an int is small it must be in the cache. So it
> should be fine.

Some places may assume that PyLong_FromLong() for a small int never
fails. I certainly expect this in coverity scan modeling.

Christian



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Guido van Rossum
> What about the small integers cache?

I believe the small integers cache is only used to reduce the number of
objects -- I don't think there's any code (in CPython itself) that just
*assumes* that because an int is small it must be in the cache. So it
should be fine.

On Thu, Sep 20, 2018 at 11:23 PM Stefan Behnel  wrote:

> Larry Hastings wrote on 14.09.2018 at 23:27:
> > What the patch does: it takes all the Python modules that are loaded as
> > part of interpreter startup and deserializes the marshalled .pyc file
> into
> > precreated objects stored as static C data.
>
> What about the small integers cache? The C serialisation generates several
> PyLong objects that would normally reside in the cache. Is this handled
> somewhere? I guess the cache could entirely be loaded from the data
> segment. And the same would have to be done for interned strings. Basically
> anything that CPython only wants to have one instance of.
>
> That would severely limit the application of this optimisation to external
> modules, though. I don't see a way how they could load their data
> structures from the data segment without duplicating all sorts of
> "singletons".
>
> Stefan


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Stefan Behnel
Larry Hastings wrote on 14.09.2018 at 23:27:
> What the patch does: it takes all the Python modules that are loaded as
> part of interpreter startup and deserializes the marshalled .pyc file into
> precreated objects stored as static C data.
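
For context, the un-marshaling step that the patch precomputes can be sketched with the standard marshal module. This only shows the conventional path (serialized code object, then rebuilt objects at load time). It says nothing about how the patch lays out its static C data.

```python
import marshal

source = "ANSWER = 6 * 7\n"
code = compile(source, "<example>", "exec")

# Roughly what a .pyc file carries after its header: the marshalled code object.
payload = marshal.dumps(code)

# The per-startup work the patch aims to avoid: rebuilding the code object
# and its constants (ints, strings, tuples) on the heap.
restored = marshal.loads(payload)

namespace = {}
exec(restored, namespace)
print(namespace["ANSWER"])
```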

What about the small integers cache? The C serialisation generates several
PyLong objects that would normally reside in the cache. Is this handled
somewhere? I guess the cache could entirely be loaded from the data
segment. And the same would have to be done for interned strings. Basically
anything that CPython only wants to have one instance of.

That would severely limit the application of this optimisation to external
modules, though. I don't see a way how they could load their data
structures from the data segment without duplicating all sorts of "singletons".

Stefan



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-20 Thread Stefan Behnel
Carl Shapiro wrote on 20.09.2018 at 20:21:
> On Wed, Sep 19, 2018 at 12:32 AM, Stefan Behnel wrote:
> 
>> Also, one thing that would be interesting to find out is whether constant
>> Python data structures can actually be pre-allocated in the data segment
>> (and I mean their object structs) . Then things like tuples of strings
>> (argument lists and what not) could be loaded and the objects quickly
>> initialised (although, is that even necessary?), rather than having to heap
>> allocate and create them. Probably something that we should try out in
>> Cython.
> 
> I might not be fully understanding the scope of your question but this
> patch does allocate constant data structures in the data segment.  We could
> be more aggressive with that but we limit our scope to what is presented to
> the un-marshaling code.

Ah, thanks, yes, it works recursively, also for tuples and code objects.
Took me a while to figure out how to open the "frozemodules.c" file, but
looking at that makes it clear. Yes, that's what I meant.


> This may be relevant to Cython, as well.

Totally. This might actually be more relevant for Cython than for CPython
in the end, because it wouldn't be limited to the stdlib and its core modules.

It's a bit more difficult for us, because this probably won't work easily
across Python releases (2.[67] and 3.[45678] for now) and also definitely
not for PyPy, but that just means some multiplication of the generated
code, and we have the dynamic part of it already. Supporting that for
Unicode strings will be fun, I'm sure. :)

Stefan



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-20 Thread Carl Shapiro
On Wed, Sep 19, 2018 at 12:32 AM, Stefan Behnel  wrote:

> Also, one thing that would be interesting to find out is whether constant
> Python data structures can actually be pre-allocated in the data segment
> (and I mean their object structs) . Then things like tuples of strings
> (argument lists and what not) could be loaded and the objects quickly
> initialised (although, is that even necessary?), rather than having to heap
> allocate and create them. Probably something that we should try out in
> Cython.
>

I might not be fully understanding the scope of your question but this
patch does allocate constant data structures in the data segment.  We could
be more aggressive with that but we limit our scope to what is presented to
the un-marshaling code.  This may be relevant to Cython, as well.


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Eric V. Smith

On 9/19/2018 9:25 PM, Barry Warsaw wrote:
> On Sep 19, 2018, at 20:34, Gregory P. Smith  wrote:
>
>> There's ongoing work to rewrite zipimport.c in python using zipfile itself
>
> Great timing!  Serhiy’s rewrite of zipimport in Python has just landed in 3.8,
> although it doesn’t use zipfile.  What’s in git now is a pretty straightforward
> translation from the original C, so it could use some clean ups (and I think
> Serhiy is planning that).  So the problem you describe should be easier to fix
> now in 3.8.  It would be interesting to see if we can squeeze more performance
> and better behavior out of it now that it’s in Python.


You don't hear "better performance" and "now that it's in Python" 
together very often! Although I agree with your point: it's like how we 
tried and failed to make progress on namespace packages when import was 
written in C, and then once it was in Python it was easy to add the 
functionality.


Eric



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Barry Warsaw
On Sep 19, 2018, at 20:34, Gregory P. Smith  wrote:

> There's ongoing work to rewrite zipimport.c in python using zipfile itself

Great timing!  Serhiy’s rewrite of zipimport in Python has just landed in 3.8, 
although it doesn’t use zipfile.  What’s in git now is a pretty straightforward 
translation from the original C, so it could use some clean ups (and I think 
Serhiy is planning that).  So the problem you describe should be easier to fix 
now in 3.8.  It would be interesting to see if we can squeeze more performance 
and better behavior out of it now that it’s in Python.

-Barry





Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Gregory P. Smith
On Sat, Sep 15, 2018 at 2:53 AM Paul Moore  wrote:

> On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer 
> wrote:
> >
> > On 2018-09-14, Larry Hastings wrote:
> > > [..] adding the stat calls back in costs you half the startup.  So
> > > any mechanism where we're talking to the disk _at all_ simply
> > > isn't going to be as fast.
> >
> > Okay, so if we use hundreds of small .pyc files scattered all over
> > the disk, that's bad?  Who would have thunk it. ;-P
> >
> > We could have a new format, .pya (compiled python archive) that has
> > data for many .pyc files in it.  In normal runs you would have one
> > or just a handful of these things (e.g. one for stdlib, one for
> > your app and all the packages it uses).  Then you mmap these just
> > once and rely on OS page faults to bring in the data as you need it.
> > The .pya would have a hash table at the start or end that tells you
> > the offset for each module.
>
> Isn't that essentially what putting the stdlib in a zipfile does? (See
> the windows embedded distribution for an example). It probably uses
> normal IO rather than mmap, but maybe adding a "use mmap" flag to the
> zipfile module would be a more general enhancement that zipimport
> could use for free.
>
> Paul
>
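
The archive idea quoted above (one file, an index of per-module offsets, a single mmap) can be sketched roughly as follows. This is only an illustration under stated assumptions: the on-disk layout, the JSON index standing in for the hash table, and the names write_archive and Archive are all invented here. No .pya format exists in CPython.

```python
import json
import mmap
import os
import struct
import tempfile


def write_archive(path, blobs):
    """Write {name: bytes} as: 4-byte index length, JSON index, payloads."""
    index, offset = {}, 0
    for name, data in blobs.items():
        index[name] = (offset, len(data))   # offset is relative to payload start
        offset += len(data)
    index_bytes = json.dumps(index).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(index_bytes)))
        f.write(index_bytes)
        for data in blobs.values():
            f.write(data)


class Archive:
    """Map the file once; the OS pages data in when a slice is first read."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        (index_len,) = struct.unpack_from("<I", self._map, 0)
        self._index = json.loads(self._map[4:4 + index_len].decode("utf-8"))
        self._base = 4 + index_len          # where the payloads begin

    def read(self, name):
        offset, length = self._index[name]
        start = self._base + offset
        return self._map[start:start + length]

    def close(self):
        self._map.close()
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.pya")
    write_archive(path, {"os": b"<code for os>", "re": b"<code for re>"})
    archive = Archive(path)
    print(archive.read("re"))
    archive.close()
```

In a real design the payloads would presumably be marshalled code objects, and the index would need to be cheap to consult without parsing, which is where a fixed-layout hash table would replace the JSON used here.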

To share a lesson learned: Putting the stdlib in a zip file is doable, but
it comes with caveats that would likely make OS distros want to undo the
change if done with CPython today.

We did that for one of our pre-built Python 2.7 distributions used
internally at Google in the 2012-2014 timeframe, thinking at the time,
"yay, fewer inodes, less disk space, and fewer stat calls by the interpreter
on all machines."

The caveat we didn't anticipate was that zipimport.c cannot handle the zip
file changing out from underneath a running process.  Ever.  It does not
hold an open file handle to the zip file (which on POSIX systems would
ameliorate the problem) but instead regularly reopens it by name while using
a zip file index cached at startup.  So when you deploy a change to your
Python interpreter (as any OS distro package update, security update,
upgrade, etc. does), existing running processes that go on to import a
stdlib module that hadn't already been imported read a different zip file
using a cached index from the previous one and... boom.  (The late import is
statistically likely to be a codec-related module, as those are often
imported upon first use rather than at startup time with most modules, the
way people tend to structure their code.)  The result is a strange rolling
error in production that is not pretty to debug.  Fixing zipimport.c to deal
with this properly was tried, but still ran into issues, and was deemed
ultimately infeasible.  There's a BPO issue or three filed about this if
you go hunting.

In contrast, having compiled-in constants in the executable is fine and
will never suffer from this problem.  Those are mapped as read-only data by
the dynamic loader and demand paged.  No complicated code is required in
CPython to manage them aside from the import-intercepting logic in the
stdlib startup code (which should be reasonably small, though I have not yet
looked at the patch in the PR).

There's ongoing work to rewrite zipimport.c in Python using zipfile itself.
If used for the stdlib, it will require everything that it needs to be
frozen into C data, similar to the existing bootstrap import logic.  Being a
different implementation of zip file reading code, it might be possible to
do without suffering the same caveat.  But storing the data on the C side
still sounds like a much simpler code path to me.

The maintenance concern is mostly about testing and building to make sure
we include everything needed by the interpreter and keep it up to date.
I'd like a configure flag controlling whether the feature is "on by
default", with it otherwise off by default and enabled by an interpreter
command line flag.  Consider adding that configure flag to the set of
things that --enable-optimizations turns on for people.

Don't be surprised if Facebook reports a startup time speedup greater than
what you ever measure yourself. Their applications are different, and if
they're using their XAR thing that mounts applications as a FUSE filesystem
- that increases stat() overhead beyond what it already is with additional
kernel round trips so it'll benefit that design even more.

Any savings in startup time by not doing a crazy amount of sequential high
latency blocking system calls is a good thing regardless.  Not just for
command line tools.  Serving applications that are starting up are
effectively spinning consuming CPUs to ultimately compute the same result
everywhere for every application every time before performing useful
work...  You can measure such an optimization in a worthwhile amount of $
or carbon footprint saved around the world.  Heat death of the universe by
a billion cuts.  Thanks for working on this!

-G

Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Larry Hastings



On 09/19/2018 03:08 PM, Terry Reedy wrote:
> If Python usually used derived stdlib code, but could optionally use
> the original .py files via a command-line switch, experimenting with
> changes to .py files would be easier.


When Carl described the patch to me, he said there was already a switch 
in there somewhere to do exactly that.  I don't remember if it was 
command-line, it might have been an environment variable.  (I admit I 
didn't go hunting for it--I didn't need it to test the patch itself, and 
I had enough to do.)  Regardless, we would definitely have that 
functionality in before the patch would ever be considered for merging.



We talked about it last week at the core dev sprint, and I thought about 
it some more.  As a result here's the behavior I propose.  I'm going to 
call the process "freezing" and the result "frozen modules", even though 
that's an already-well-overused name and I hope we'll pick something 
else before it gets merged.


   First, .py files that get frozen will have their date/time stamps
   set to a known value, both as part of the tarball / zip file, and
   when installed (a la "make install", the Win32 installer, etc). 
   There are some constraints on this; we distribute Python via .zip
   files, and .zip only supports 2 second resolution for date/time
   stamps.  So maybe something like this: the date is the approximate
   date of the release, and the time is the version number (e.g.
   03:08:00 for all 3.8.x releases).

   When attempting to load a frozen Python module, Python will stat the
   .py file.  If the date/time and size match what we expected, Python
   will use the frozen module.  Otherwise it'll fall back to
   conventional behavior, including supporting .pyc files.

   There will also be a switch (command-line? environment variable?
   compile-time flag? all three?) for people who control their
   environments where you can skip the .py file and use the frozen
   module every time.

In short: correctness by default, and more speed available if you know 
it's safe for your use case.  Use of the optimization is intentionally a 
little fragile, to ensure correctness.
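
The proposed check could look roughly like the sketch below. It assumes the expected timestamp and size are recorded somewhere at freeze time. The function name, the whole-second comparison, and the choice to treat a missing .py file as "do not use the frozen copy" are assumptions for illustration, not part of the proposal.

```python
import os
import tempfile


def frozen_copy_is_valid(py_path, expected_mtime, expected_size):
    """Return True if the on-disk .py still matches what was frozen."""
    try:
        st = os.stat(py_path)
    except OSError:
        # Assumption: a missing or unreadable file means fall back to the
        # conventional import path.
        return False
    return int(st.st_mtime) == expected_mtime and st.st_size == expected_size


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "w") as f:
        f.write("x = 1\n")
    known_stamp = 1_537_326_480          # arbitrary even timestamp for the demo
    os.utime(path, (known_stamp, known_stamp))
    size = os.path.getsize(path)

    print(frozen_copy_is_valid(path, known_stamp, size))    # untouched file
    with open(path, "a") as f:
        f.write("y = 2\n")                                   # a local edit
    print(frozen_copy_is_valid(path, known_stamp, size))    # edit is detected
```

The "use the frozen module every time" switch would simply skip this function, which also removes the one stat call per module.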



Cheers,


//arry/

p.s. Why not 03:08:01 for 3.8.1?  That wouldn't be stored properly in 
the .zip file with its only-two-second resolution.  And multiplying the 
tertiary version number by 2--or 10, etc--would be surprising.
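
The two-second limit comes from the MS-DOS time field that .zip entries use, which stores seconds divided by two in five bits. A small sketch of that packing (the helper names are made up for illustration):

```python
def to_dos_time(hour, minute, second):
    # MS-DOS time field: 5 bits of hours, 6 bits of minutes, and 5 bits
    # holding the seconds divided by two.
    return (hour << 11) | (minute << 5) | (second // 2)


def from_dos_time(value):
    return (value >> 11, (value >> 5) & 0x3F, (value & 0x1F) * 2)


print(from_dos_time(to_dos_time(3, 8, 0)))   # an even second survives
print(from_dos_time(to_dos_time(3, 8, 1)))   # an odd second is rounded down
```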


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Terry Reedy

On 9/18/2018 2:38 PM, Steve Dower wrote:
> The primary benefit of the importlib hook approach is that it would not
> require rebuilding CPython each time you make a change.


If one edits a .c or .h file, one must rebuild to test.  If one edits a 
.py module, one does not, and it would be a major nuisance to have to.


My first suggested patches on the tracker (to .py files) were developed 
in my installed version (after backing up a module).  I have 
occasionally told people on StackOverflow how to edit an idlelib file to 
get a future change 'now'.  Other people have occasionally reported their
own custom modifications.  If Python usually used derived stdlib code,
but could optionally use the original .py files via a command-line 
switch, experimenting with changes to .py files would be easier.


--
Terry Jan Reedy



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Stefan Behnel
Carl Shapiro wrote on 18.09.2018 at 22:44:
> How might people feel about using the linker to bundle a list of pre-loaded
> modules into a single-file executable?

One way to do that would be to compile Python modules with Cython and link
them in statically, instead of compiling them to .pyc files.

Advantage: you get native C .o files, fast and straightforward to link.

Disadvantage: native code is much more voluminous than byte code, so the
overall binary size would grow substantially.

Also, one thing that would be interesting to find out is whether constant
Python data structures can actually be pre-allocated in the data segment
(and I mean their object structs). Then things like tuples of strings
(argument lists and what not) could be loaded and the objects quickly
initialised (although, is that even necessary?), rather than having to heap
allocate and create them. Probably something that we should try out in Cython.

Stefan



Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Neil Schemenauer
On 2018-09-18, Carl Shapiro wrote:
> How might people feel about using the linker to bundle a list of pre-loaded
> modules into a single-file executable?

The users of Python are pretty diverse so it depends on who you ask.
Some would like a giant executable that includes everything they
need (sort of like the Go approach).  Other people want an executable
that has just importlib inside it and then mix-and-match different
shared libs for their different purposes.  Some will want to work
"old school" and load from separate .py or .pyc files.

I see no reason why we can't support all these options.

Regards,

  Neil


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Carl Shapiro
On Tue, Sep 18, 2018 at 11:38 AM, Steve Dower 
wrote:

> The primary benefit of the importlib hook approach is that it would not
> require rebuilding CPython each time you make a change. Since we need to
> consider a wide range of users across a wide range of platforms, having the
> ability to load a single native module that contains many "pre-loaded"
> modules allows many more people to access the benefits.
>

> It would not prevent some specific modules from being compiled into the
> main binary, but for those who do not build their own Python it would also
> allow specific applications to use the feature as well.
>

How might people feel about using the linker to bundle a list of pre-loaded
modules into a single-file executable?  That would avoid the inconvenience
of rebuilding all of CPython by shipping a static libpython and having the
tool generate a .o or .S file with the un-marshaled data.  (Linkers and
assemblers are small enough to be bundled on systems that do not have them.)


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Fabio Zadrozny
On Tue, Sep 18, 2018 at 2:57 PM, Carl Shapiro 
wrote:

> On Tue, Sep 18, 2018 at 5:55 AM, Fabio Zadrozny  wrote:
>
>> During the import process, Python can already deal with folders and .zip
>> files in sys.path... now, instead of having special handling for a new
>> concept with a custom command line, etc, why not just say that this is a
>> special file (e.g.: files with a .pyfrozen extension) and make importlib be
>> able to deal with it when it's on sys.path (that way there could be
>> multiple of those and there should be no need to turn it on/off, custom
>> command line, etc)?
>>
>
> That is an interesting idea but it might not be easy to work into this
> design.  The improvement in start-up time comes from eliminating the
> overheads of filesystem I/O, memory allocation, and un-marshaling
> bytecode.  Having this data on the filesystem would reintroduce the cost of
> filesystem I/O and it would add a load-time relocation to the equation so
> the overall performance benefits would be greatly lessened.
>
>
>> Another question: doesn't importlib already provide hooks for external
>> contributors which could address that use case? (so, this could initially
>> be available as a third party library for maturing outside of CPython and
>> then when it's deemed to be mature it could be integrated into CPython --
>> not that this can't happen on Python 3.8 timeframe, but it'd be useful
>> checking its use against the current Python version and measuring benefits
>> with real world code).
>>
>
> This may be possible but, for the same reasons I outline above, it would
> certainly come at the expense of performance.
>
> I think many people are interested in a better .pyc format but our goals
> are much more modest.  We are actually trying to not introduce a whole new
> way to externalize .py data in CPython.  Rather, we think of this as just
> making the existing frozen module capability much faster so its use can be
> broadened to making start-up performance better.  The user visible part,
> the command line interface to bypass the frozen module, would be a
> nice-to-have for developers but is something we could live without.
>

Just to make sure we're on the same page, the approach I'm talking about
would still be having a dll, not a better .pyc format, so, during the
import a custom importer would open that dll once and provide modules from
it -- do you think this would be much more overhead than what's proposed
now?

I guess it may be a bit slower because it'd have to obey the existing
import capabilities, but that shouldn't mean more time is spent on I/O,
memory allocation, or un-marshaling bytecode (although it may be that I
misunderstood the approach, or that the current import capabilities don't
provide the proper API for that).
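
To make the custom-importer idea concrete, here is a minimal sketch of a
sys.meta_path hook that serves modules from an in-memory table.  The table
`_PRELOADED`, the class `PreloadedFinder`, and the module name
`hello_frozen` are all hypothetical; a real version would get its code
objects from a native library rather than from compile().

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for the table a DLL would expose:
# module name -> ready-to-run code object.
_PRELOADED = {
    "hello_frozen": compile(
        "GREETING = 'hi from the table'", "<preloaded hello_frozen>", "exec"
    ),
}


class PreloadedFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object, consulted before the path finders."""

    def find_spec(self, fullname, path=None, target=None):
        # Only claim modules we actually hold; everything else falls through.
        if fullname in _PRELOADED:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        return None  # let the import system create a plain module object

    def exec_module(self, module):
        # No stat calls and no file reads: the code object is already in memory.
        exec(_PRELOADED[module.__name__], module.__dict__)


sys.meta_path.insert(0, PreloadedFinder())

import hello_frozen

print(hello_frozen.GREETING)
```

The hook itself never touches the filesystem for the modules it owns, so
whatever extra cost remains would come from the import machinery around it
and from however the table gets populated.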


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Steve Dower

On 18Sep2018 1057, Carl Shapiro wrote:
> On Tue, Sep 18, 2018 at 5:55 AM, Fabio Zadrozny wrote:
>
>> During the import process, Python can already deal with folders and
>> .zip files in sys.path... now, instead of having special handling
>> for a new concept with a custom command line, etc, why not just say
>> that this is a special file (e.g.: files with a .pyfrozen extension)
>> and make importlib be able to deal with it when it's on sys.path
>> (that way there could be multiple of those and there should be no
>> need to turn it on/off, custom command line, etc)?
>
> That is an interesting idea but it might not be easy to work into this
> design.  The improvement in start-up time comes from eliminating the
> overheads of filesystem I/O, memory allocation, and un-marshaling
> bytecode.  Having this data on the filesystem would reintroduce the cost
> of filesystem I/O and it would add a load-time relocation to the
> equation so the overall performance benefits would be greatly lessened.
>
>> Another question: doesn't importlib already provide hooks for
>> external contributors which could address that use case? (so, this
>> could initially be available as a third party library for maturing
>> outside of CPython and then when it's deemed to be mature it could
>> be integrated into CPython -- not that this can't happen on Python
>> 3.8 timeframe, but it'd be useful checking its use against the
>> current Python version and measuring benefits with real world code).
>
> This may be possible but, for the same reasons I outline above, it would
> certainly come at the expense of performance.
>
> I think many people are interested in a better .pyc format but our goals
> are much more modest.  We are actually trying to not introduce a whole
> new way to externalize .py data in CPython.  Rather, we think of this as
> just making the existing frozen module capability much faster so its use
> can be broadened to making start-up performance better.  The user
> visible part, the command line interface to bypass the frozen module,
> would be a nice-to-have for developers but is something we could live
> without.


The primary benefit of the importlib hook approach is that it would not 
require rebuilding CPython each time you make a change. Since we need to 
consider a wide range of users across a wide range of platforms, having 
the ability to load a single native module that contains many 
"pre-loaded" modules allows many more people to access the benefits.


It would not prevent some specific modules from being compiled into the 
main binary, but for those who do not build their own Python it would 
also allow specific applications to use the feature as well.


FWIW, I don't read this as being pushed back on Carl to implement before 
the idea is accepted. I think we're taking the (now proven) core idea 
and shaping it into a suitable form for the main CPython distribution, 
which has to take more use cases into account.


Cheers,
Steve


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Carl Shapiro
On Tue, Sep 18, 2018 at 5:55 AM, Fabio Zadrozny  wrote:

> During the import process, Python can already deal with folders and .zip
> files in sys.path... now, instead of having special handling for a new
> concept with a custom command line, etc, why not just say that this is a
> special file (e.g.: files with a .pyfrozen extension) and make importlib be
> able to deal with it when it's on sys.path (that way there could be
> multiple of those and there should be no need to turn it on/off, custom
> command line, etc)?
>

That is an interesting idea but it might not be easy to work into this
design.  The improvement in start-up time comes from eliminating the
overheads of filesystem I/O, memory allocation, and un-marshaling
bytecode.  Having this data on the filesystem would reintroduce the cost of
filesystem I/O and it would add a load-time relocation to the equation so
the overall performance benefits would be greatly lessened.


> Another question: doesn't importlib already provide hooks for external
> contributors which could address that use case? (so, this could initially
> be available as a third party library for maturing outside of CPython and
> then when it's deemed to be mature it could be integrated into CPython --
> not that this can't happen on Python 3.8 timeframe, but it'd be useful
> checking its use against the current Python version and measuring benefits
> with real world code).
>

This may be possible but, for the same reasons I outline above, it would
certainly come at the expense of performance.

I think many people are interested in a better .pyc format but our goals
are much more modest.  We are actually trying to not introduce a whole new
way to externalize .py data in CPython.  Rather, we think of this as just
making the existing frozen module capability much faster so its use can be
broadened to making start-up performance better.  The user visible part,
the command line interface to bypass the frozen module, would be a
nice-to-have for developers but is something we could live without.


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Carl Shapiro
On Tue, Sep 18, 2018 at 1:31 AM, Antoine Pitrou  wrote:

> No idea.  In my previous experiments with module import speed, I
> concluded that executing module bytecode generally was the dominating
> contributor, but that doesn't mean loading bytecode is costless.
>

My observations might not be so different.  On a large application, we
measured ~25-30% of start-up time being spent in the loading of compiled
bytecode.  That includes: probing the filesystem, reading the bytecode off
disk, allocating heap storage, and un-marshaling objects into the heap.

Making that percentage go to ~0% using this change does not make the
non-import parts of our module body functions execute faster.  It does
create a greater opportunity for the application developer to do less work
in module body functions which is where the largest start-up time gains are
now likely to happen.
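
A rough way to see the split between un-marshaling and executing a module
body is to time the two steps separately on an in-memory payload.  This is
only a sketch on a synthetic module (2000 trivial functions, an arbitrary
choice); it leaves out filesystem probing and disk reads, which are part of
the ~25-30% figure above, so it is not a reproduction of that measurement.

```python
import marshal
import time

# Synthetic module body: many small functions, so there is something to load.
source = "\n".join(f"def f{i}(x):\n    return x + {i}" for i in range(2000))
code = compile(source, "<synthetic>", "exec")

# This is roughly what the body of a .pyc file holds after its header.
payload = marshal.dumps(code)

t0 = time.perf_counter()
loaded = marshal.loads(payload)   # allocate and rebuild the code objects
t1 = time.perf_counter()
namespace = {}
exec(loaded, namespace)           # run the module body
t2 = time.perf_counter()

print(f"payload size: {len(payload)} bytes")
print(f"unmarshal:    {(t1 - t0) * 1e3:.3f} ms")
print(f"execute:      {(t2 - t1) * 1e3:.3f} ms")
```

The numbers depend heavily on what the module body does; a body that only
defines functions executes quickly, while one that builds large tables or
imports other packages will be dominated by execution.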


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Fabio Zadrozny
On Mon, Sep 17, 2018 at 9:23 PM, Carl Shapiro 
wrote:

> On Sun, Sep 16, 2018 at 1:24 PM, Antoine Pitrou 
> wrote:
>
>> I think it's of limited interest if it only helps with modules used
>> during the startup sequence, not arbitrary stdlib or third-party
>> modules.
>>
>
> This should help any use case that already uses the freeze module
> bundled with CPython.  Third-party code, like py2exe, py2app,
> pyinstaller, and XAR could build upon this to create applications that
> start faster.
>

I think this seems like a great idea.

Some questions though:

During the import process, Python can already deal with folders and .zip
files in sys.path... now, instead of having special handling for a new
concept with a custom command line, etc, why not just say that this is a
special file (e.g.: files with a .pyfrozen extension) and make importlib be
able to deal with it when it's on sys.path (that way there could be
multiple of those and there should be no need to turn it on/off, custom
command line, etc)?

Another question: doesn't importlib already provide hooks for external
contributors which could address that use case? (so, this could initially
be available as a third party library for maturing outside of CPython and
then when it's deemed to be mature it could be integrated into CPython --
not that this can't happen on Python 3.8 timeframe, but it'd be useful
checking its use against the current Python version and measuring benefits
with real world code).

>> To give an idea, on my machine the baseline Python startup is about 20ms
>> (`time python -c pass`), but if I import Numpy it grows to 100ms, and
>> with Pandas it's more than 200ms.  Saving 4ms on the baseline startup
>> would make no practical difference for concrete usage.
>>
>
> Do you have a feeling for how many of those milliseconds are spent loading
> bytecode from disk?  If so standalone executables that contain numpy and
> pandas (and mercurial) would start faster.
>
>
>> I'm ready to think there are other use cases where it matters, though.
>>
>
> I think so.  I hope you will, too :-)
>


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-18 Thread Antoine Pitrou
On Mon, 17 Sep 2018 17:23:26 -0700
Carl Shapiro  wrote:
> 
> > To give an idea, on my machine the baseline Python startup is about 20ms
> > (`time python -c pass`), but if I import Numpy it grows to 100ms, and
> > with Pandas it's more than 200ms.  Saving 4ms on the baseline startup
> > would make no practical difference for concrete usage.
> >  
> 
> Do you have a feeling for how many of those milliseconds are spent loading
> bytecode from disk?

No idea.  In my previous experiments with module import speed, I
concluded that executing module bytecode generally was the dominating
contributor, but that doesn't mean loading bytecode is costless.

Regards

Antoine.


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-17 Thread Carl Shapiro
On Sun, Sep 16, 2018 at 1:24 PM, Antoine Pitrou  wrote:

> I think it's of limited interest if it only helps with modules used
> during the startup sequence, not arbitrary stdlib or third-party
> modules.
>

This should help any use case that already uses the freeze module
bundled with CPython.  Third-party code, like py2exe, py2app,
pyinstaller, and XAR could build upon this to create applications that
start faster.


> To give an idea, on my machine the baseline Python startup is about 20ms
> (`time python -c pass`), but if I import Numpy it grows to 100ms, and
> with Pandas it's more than 200ms.  Saving 4ms on the baseline startup
> would make no practical difference for concrete usage.
>

Do you have a feeling for how many of those milliseconds are spent loading
bytecode from disk?  If so standalone executables that contain numpy and
pandas (and mercurial) would start faster.


> I'm ready to think there are other use cases where it matters, though.
>

I think so.  I hope you will, too :-)


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-16 Thread Antoine Pitrou
On Fri, 14 Sep 2018 14:27:37 -0700
Larry Hastings  wrote:
> 
> I don't propose to merge the patch in its current state.  I think it 
> would need a lot of work both in terms of "doing things the way Python 
> does it" as well as just code smell (the serializer is implemented in 
> both C and Python and jumps back and forth, also the build process for 
> the serialized modules is pretty tiresome).
> 
> Is it worth working on?

I think it's of limited interest if it only helps with modules used
during the startup sequence, not arbitrary stdlib or third-party
modules.

To give an idea, on my machine the baseline Python startup is about 20ms
(`time python -c pass`), but if I import Numpy it grows to 100ms, and
with Pandas it's more than 200ms.  Saving 4ms on the baseline startup
would make no practical difference for concrete usage.
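
For anyone who wants comparable numbers on their own machine, here is a
small sketch that times interpreter startup from Python itself.  The helper
name `startup_ms` and the choice of taking the best of five runs are my own
assumptions; it measures whole-process wall time, like `time python -c pass`.

```python
import subprocess
import sys
import time


def startup_ms(statement, runs=5):
    """Best-of-N wall time, in milliseconds, to run `python -c statement`."""
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, time.perf_counter() - t0)
    return best * 1e3


print(f"baseline (pass): {startup_ms('pass'):.1f} ms")
# To compare against a heavier import, pass e.g. 'import json' (or
# 'import numpy' if it is installed) as the statement.
print(f"import json:     {startup_ms('import json'):.1f} ms")
```

Running `python -X importtime -c "import json"` (available since 3.7) gives
a per-module breakdown of where the import time goes.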

I'm ready to think there are other use cases where it matters, though.

Regards

Antoine.




Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-16 Thread Neil Schemenauer
On 2018-09-15, Paul Moore wrote:
> On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer  wrote:
> > We could have a new format, .pya (compiled python archive) that has
> > data for many .pyc files in it.
[..]
> Isn't that essentially what putting the stdlib in a zipfile does? (See
> the windows embedded distribution for an example). It probably uses
> normal IO rather than mmap, but maybe adding a "use mmap" flag to the
> zipfile module would be a more general enhancement that zipimport
> could use for free.

Yeah, it's close to the same thing.  If the syscalls are what gives
the speedup, using a better zipfile implementation might give nearly
the same benefit.

At the sprint we discussed a variation of Larry's (FB's) patch.
Allow the frozen data to be in DLLs as well as in the python
executable data segment.  So, importlib would be frozen into the
exe.  The standard library could become another DLL.  The user could
provide one or more DLLs that contain their app code and package
deps.  In general, I think there would only be two DLLs: stdlib and
app+deps.

My suggestion of a special format (similar to zipfile) was
motivated by the wish to avoid platform build tools.  E.g. Windows
users would have a harder time building DLLs.  However, I now think
depending on platform build tools is fine.  The people who will
build these DLLs will have the tools and skills to do so.  Even if
there is only a DLL for the stdlib, it will be a win.  If no DLLs
are provided, you get the same behavior as current Python (i.e.
importlib is frozen in, everything else can come from .py files).

I think there is no question that Larry's PR will be faster than the
zipfile approach.  It removes the unmarshal step.  Maybe that benefit
will be small but I think it should count.  Also, I suspect the OS
can page-in the DLL on-demand and perhaps leave parts of module .pyc
data on disk.  Larry had the idea of keeping code objects frozen
until they need to be executed.  It's a cool idea that would be
enabled by this first step.

I'm excited about Larry's PR.  I think if we get it cleaned up and
into Python 3.8, we will clearly leave Python 2.7 behind in terms of
startup performance.  That has been a goal of mine for a couple of
years now.

Regards,

  Neil


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-15 Thread Paul Moore
On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer  wrote:
>
> On 2018-09-14, Larry Hastings wrote:
> > [..] adding the stat calls back in costs you half the startup.  So
> > any mechanism where we're talking to the disk _at all_ simply
> > isn't going to be as fast.
>
> Okay, so if we use hundreds of small .pyc files scattered all over
> the disk, that's bad?  Who would have thunk it. ;-P
>
> We could have a new format, .pya (compiled python archive) that has
> data for many .pyc files in it.  In normal runs you would have one
> or just a handful of these things (e.g. one for stdlib, one for
> your app and all the packages it uses).  Then you mmap these just
> once and rely on OS page faults to bring in the data as you need it.
> The .pya would have a hash table at the start or end that tells you
> the offset for each module.

Isn't that essentially what putting the stdlib in a zipfile does? (See
the windows embedded distribution for an example). It probably uses
normal IO rather than mmap, but maybe adding a "use mmap" flag to the
zipfile module would be a more general enhancement that zipimport
could use for free.
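
For reference, the existing zip path entry behaviour can be shown with a few
lines.  This sketch builds a throwaway archive in a temp directory; the
names `bundle.zip` and `zipped_mod` are made up for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a tiny archive holding one module.
zip_path = os.path.join(tempfile.mkdtemp(), "bundle.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("zipped_mod.py", "VALUE = 42\n")

# A .zip file on sys.path is handled by zipimport, no custom hook needed.
sys.path.insert(0, zip_path)
importlib.invalidate_caches()

import zipped_mod

print(zipped_mod.VALUE)
print(zipped_mod.__file__)
print(type(zipped_mod.__loader__).__name__)
```

One archive open replaces a directory probe per module, but the module data
is still read with ordinary file I/O and still has to be compiled or
un-marshaled before it can run.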

Paul


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-14 Thread Neil Schemenauer
On 2018-09-14, Larry Hastings wrote:
> [..] adding the stat calls back in costs you half the startup.  So
> any mechanism where we're talking to the disk _at all_ simply
> isn't going to be as fast.

Okay, so if we use hundreds of small .pyc files scattered all over
the disk, that's bad?  Who would have thunk it. ;-P

We could have a new format, .pya (compiled python archive) that has
data for many .pyc files in it.  In normal runs you would have one
or just a handful of these things (e.g. one for stdlib, one for
your app and all the packages it uses).  Then you mmap these just
once and rely on OS page faults to bring in the data as you need it.
The .pya would have a hash table at the start or end that tells you
the offset for each module.
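
A minimal sketch of what such an archive could look like, purely as an
illustration: the on-disk layout (blobs, then a marshaled index, then an
8-byte offset to the index), the function `write_archive`, and the class
`Archive` are all assumptions of mine, not an existing format or API.

```python
import marshal
import mmap
import os
import struct
import tempfile


def write_archive(path, sources):
    """Write code blobs, then an index {name: (offset, size)}, then the
    index offset as a little-endian 64-bit integer (hypothetical layout)."""
    index = {}
    with open(path, "wb") as f:
        for name, src in sources.items():
            blob = marshal.dumps(compile(src, f"<{name}>", "exec"))
            index[name] = (f.tell(), len(blob))
            f.write(blob)
        index_offset = f.tell()
        f.write(marshal.dumps(index))
        f.write(struct.pack("<Q", index_offset))


class Archive:
    """Maps the file once; pages are faulted in only when a module is read."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        end = len(self._map)
        (index_offset,) = struct.unpack("<Q", self._map[end - 8:end])
        self._index = marshal.loads(self._map[index_offset:end - 8])

    def load_code(self, name):
        offset, size = self._index[name]
        return marshal.loads(self._map[offset:offset + size])

    def close(self):
        self._map.close()
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "demo.pya")
write_archive(path, {"a": "X = 1", "b": "Y = 2"})

archive = Archive(path)
ns = {}
exec(archive.load_code("b"), ns)
print(ns["Y"])
archive.close()
```

Note that this still pays for un-marshaling on every load; it only collapses
the per-module stat/open/read calls into one open and one mmap.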

Regards,

  Neil


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-14 Thread Larry Hastings



On 09/14/2018 02:54 PM, Neil Schemenauer wrote:
> On 2018-09-14, Larry Hastings wrote:
> [...]
>> improvement 0.21242667903482038 %
>
> I assume that should be 21.2 % otherwise I recommend you abandon the
> idea. ;-P

Yeah, that thing you said.

> I wonder how much of the speedup relies on putting it in the data
> segment (i.e. using linker/loader to essentially handle the
> unmarshal).  What if you had a new marshal format that only needed a
> light 2nd pass in order to fix up the data loaded from disk?

Some experimentation would be in order.  I can suggest, based on a
conversation with Carl, that adding the stat calls back in costs you
half the startup.  So any mechanism where we're talking to the disk _at
all_ simply isn't going to be as fast.



//arry/


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-14 Thread Neil Schemenauer
On 2018-09-14, Larry Hastings wrote:
[...]
> improvement 0.21242667903482038 %

I assume that should be 21.2 % otherwise I recommend you abandon the
idea. ;-P

> The downside of the patch: for these modules it ignores the Python files on
> disk--it doesn't even stat them.

Having a command-line/env var to turn this on/off would be an
acceptable fix, IMHO.  If I'm running Python on a server, I don't need
to be editing .py modules and have them be recognized.  Maybe have
it turned off by default, at least at first.

> Is it worth working on?

I wonder how much of the speedup relies on putting it in the data
segment (i.e. using linker/loader to essentially handle the
unmarshal).  What if you had a new marshal format that only needed a
light 2nd pass in order to fix up the data loaded from disk?  Yuri
suggested looking at formats like Cap'n Proto.  If the cost of the
2nd pass was not bad, you wouldn't have to rely on the platform C
toolchain.  Instead we can write .pyc files that hold this data.

Then the speedup can work on all compiled Python modules, not just
the ones you go through the special process that links them into the
data segment.  I suppose that might mean that .pyc files become arch
specific.  Maybe that's okay.

As you said last night, there doesn't seem to be much low hanging
fruit around anymore.  So, 21% looks pretty decent.

Regards,

  Neil