[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Jonathan Fine
Hi
SUMMARY: We're starting to discuss implementation. I'm going to focus on what can be done, with only a few changes to the interpreter.
First consider this:
>>> from sys import getrefcount as grc
>>> def fn(obj): return grc(obj)
>>> grc(fn.__code__), grc(fn.__code__.co_code)
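The interactive session quoted in this message can be run as a small script. This is only a sketch of the same three statements; the variable names code_refs and bytecode_refs are added here for illustration, and no particular counts are claimed because they vary between CPython versions.

```python
from sys import getrefcount as grc


def fn(obj):
    # Report how many references the interpreter currently holds to obj.
    return grc(obj)


# Reference count of the function's code object and of its bytecode string.
# Both are ordinary objects with their own reference counts, which is the
# property the thread is concerned with.
code_refs = grc(fn.__code__)
bytecode_refs = grc(fn.__code__.co_code)

print("refs to fn.__code__:", code_refs)
print("refs to fn.__code__.co_code:", bytecode_refs)
```

The point of interest is that both values are live counters stored with the objects, so using the objects implies writing to their memory.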

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Guido van Rossum
I like where this is going. It would be nice if certain constants could also be loaded from RO memory. On Mon, Jun 22, 2020 at 00:16 Inada Naoki wrote: > On Mon, Jun 22, 2020 at 12:00 AM Guido van Rossum > wrote: > > > > > > I believe this was what Greg Stein's idea here was about. (As well as

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread M.-A. Lemburg
If you want to proceed in this direction, it would be better to do some more research into current CPU architectures and then build a VM optimized byte code storage object, which is well aligned, fits into today's caches and improves locality. freeze.py could then write out this format as well,

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Inada Naoki
On Mon, Jun 22, 2020 at 8:27 PM Barry Scott wrote: > > * New code and pyc format > * pyc has "rodata" segment >* It can be copied into single memory block, or can be mmapped. > * co_code should be aligned at least 2 bytes. > > > Would higher alignment help? malloc is using 8 or 16 byte
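The "rodata" idea in the quoted list can be sketched with the standard mmap module. This is a hypothetical layout, not the pyc format and not anything proposed in detail in the thread: the helper align_up, the sample blobs, and the offset table are all assumptions made for illustration. It packs byte strings into one block at 2-byte-aligned offsets, maps the file read-only, and reads each entry back through a memoryview without copying.

```python
import mmap
import os
import tempfile


def align_up(offset, alignment=2):
    # Round offset up to the next multiple of alignment (hypothetical helper).
    return (offset + alignment - 1) // alignment * alignment


# Made-up payloads standing in for co_code and other read-only data.
blobs = [b"\x64\x00\x53\x00", b"abc", b"\x7c\x00\x53\x00"]

# Build a single memory block, padding so each entry starts on an even offset.
segment = bytearray()
offsets = []
for blob in blobs:
    segment.extend(b"\0" * (align_up(len(segment)) - len(segment)))
    offsets.append((len(segment), len(blob)))
    segment.extend(blob)

with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(segment)
    path = handle.name

# Map the file read-only; slices of the memoryview refer to the mapped pages.
with open(path, "rb") as handle:
    mapped = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)

view = memoryview(mapped)
slices = [view[start:start + length] for start, length in offsets]

all_readonly = all(piece.readonly for piece in slices)
recovered = [bytes(piece) for piece in slices]  # copies made only for checking

# Release the views before closing the mapping, then remove the temp file.
for piece in slices:
    piece.release()
view.release()
mapped.close()
os.remove(path)

print(offsets, all_readonly)
```

A higher alignment, as asked about in the reply, would only change the alignment argument passed to align_up in this sketch.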

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Barry Scott
Let's try again... > On 22 Jun 2020, at 08:15, Inada Naoki wrote: > > On Mon, Jun 22, 2020 at 12:00 AM Guido van Rossum wrote: >> >> >> I believe this was what Greg Stein's idea here was about. (As well as >> Jonathan Fine's in this thread?) But the current use of code objects makes >>

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Barry Scott
> On 22 Jun 2020, at 08:15, Inada Naoki wrote: > > On Mon, Jun 22, 2020 at 12:00 AM Guido van Rossum wrote: >> >> >> I believe this was what Greg Stein's idea here was about. (As well as >> Jonathan Fine's in this thread?) But the current use of code objects makes >> this hard. Perhaps

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Chris Angelico
On Mon, Jun 22, 2020 at 5:19 PM Inada Naoki wrote: > I think lightweight bytes-like object is better. My rough idea is: > > * New code and pyc format > * pyc has "rodata" segment > * It can be copied into single memory block, or can be mmapped. > * co_code should be aligned at least

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-22 Thread Inada Naoki
On Mon, Jun 22, 2020 at 12:00 AM Guido van Rossum wrote: > > > I believe this was what Greg Stein's idea here was about. (As well as > Jonathan Fine's in this thread?) But the current use of code objects makes > this hard. Perhaps the code objects could have a memoryview object to hold > the

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-21 Thread Guido van Rossum
On Sun, Jun 21, 2020 at 02:53 M.-A. Lemburg wrote: > On 21.06.2020 01:47, Guido van Rossum wrote: > > Hm, I remember Greg's free threading too, but that's not the idea I was > > trying to recall this time. There really was something about bytecode > > objects being loaded from a read-only

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-21 Thread M.-A. Lemburg
On 21.06.2020 01:47, Guido van Rossum wrote: > Hm, I remember Greg's free threading too, but that's not the idea I was > trying to recall this time. There really was something about bytecode > objects being loaded from a read-only segment to speed up code loading. > (Much quicker than

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-21 Thread Antoine Pitrou
On Sun, 21 Jun 2020 11:07:05 +0200 Antoine Pitrou wrote: > > There's no such thing as "the cache". There are usually several levels > of cache. L1 cache is closest to the CPU [...] ... Note by "closest to the CPU" I really mean "closest to the CPU core's execution units". Those caches are

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-21 Thread Antoine Pitrou
On Fri, 19 Jun 2020 20:30:03 +0100 Barry Scott wrote: > > > I know very little about how this works except a vague rule of thumb > > that in the 21st century memory locality is king. If you want code to be > > fast, keep it close together, not spread out. > > Remember that the caches are

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-20 Thread Jeethu Rao
On a related note, there was a patch that I’d written for Python 3.6 to store code objects in the read only segment of the interpreter binary for faster interpreter startup. I’d sent the patch to Larry Hastings, who graciously ported it to Python 3.8 and posted it on bpo[1]. - Jeethu [1]:

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-20 Thread Guido van Rossum
Hm, I remember Greg's free threading too, but that's not the idea I was trying to recall this time. There really was something about bytecode objects being loaded from a read-only segment to speed up code loading. (Much quicker than unmarshalling a .pyc file.) I don't think we ever got the details

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-20 Thread Jonathan Fine
Hi All Guido wrote: I remember vaguely that about two decades ago Greg Stein hatched an idea > for code objects loaded from a read-only segment in shared libraries. > [Thank you for this, Guido. Your memory is good.] Here's a thread from 2009, where Guido said: Greg Stein reached this same

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Guido van Rossum
I remember vaguely that about two decades ago Greg Stein hatched an idea for code objects loaded from a read-only segment in shared libraries. I believe we went as far as ensuring that the interpreter could read bytecode from other things than strings, and I vaguely recall seeing a design for a

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Steven D'Aprano
On Fri, Jun 19, 2020 at 09:36:24AM +0100, Jonathan Fine wrote: > What I did not say explicitly, or not clearly enough, was that the previous > use would continue unchanged. The only change would be that a function > object would have a flag, which would tell the interpreter whether the >

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Greg Ewing
On 20/06/20 1:15 pm, Steven D'Aprano wrote: Here is some evidence that cache misses make a real difference for performance. A 70% slow down on calling functions, due to an increase in L1 cache misses: https://bugs.python.org/issue28618 There's no doubt that cache misses are a big issue for

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Steven D'Aprano
On Thu, Jun 18, 2020 at 09:30:30PM -0400, Jonathan Goble wrote: > With that said, your proposal is unclear to me on whether this would force > immutability on all code objects (and thereby prevent all bytecode > modification), or whether it would have an opt-out (or opt-in) mechanism. Code

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Steven D'Aprano
On Fri, Jun 19, 2020 at 06:33:59PM +1200, Greg Ewing wrote: > On 19/06/20 9:28 am, Steven D'Aprano wrote: > >I know very little about how this works except a vague rule of thumb > >that in the 21st century memory locality is king. If you want code to be > >fast, keep it close together, not spread

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Barry Scott
> On 18 Jun 2020, at 22:28, Steven D'Aprano wrote: > > On Thu, Jun 18, 2020 at 06:49:13PM +0100, Barry Scott wrote: > >> The key part of the idea is that the memory holding the ref count is >> not adjacent to the memory holding the objects state. Further that >> rarely modified state

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Jonathan Fine
Hi Greg You wrote: On 19/06/20 9:28 am, Steven D'Aprano wrote: > > I know very little about how this works except a vague rule of thumb > > that in the 21st century memory locality is king. If you want code to be > > fast, keep it close together, not spread out. > > Python objects are already

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Jonathan Fine
Hi Richard Thank you for your interest. You wrote: One thought that I had is the fact that this whole proposal seems to be > based on code blocks never needing to be collected? > That's not quite what I meant to say. One part of the basic idea is that permanent code objects be made available to

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-19 Thread Greg Ewing
On 19/06/20 9:28 am, Steven D'Aprano wrote: I know very little about how this works except a vague rule of thumb that in the 21st century memory locality is king. If you want code to be fast, keep it close together, not spread out. Python objects are already scattered all over memory, and a

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Richard Damon
One thought that I had is the fact that this whole proposal seems to be based on code blocks never needing to be collected? Given the program:
def fun1(v):
    return v
def fun2(v):
    return v+1
fun1 = fun2
The function code block that was originally bound to the name fun1 should now
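The program in this message can be made observable with a weak reference. This is a sketch under assumptions: it watches the function object rather than the code block itself, the names original_ref and original_collected are added for illustration, and whether the object is reclaimed straight away depends on the interpreter, so no result is claimed for that flag.

```python
import gc
import weakref


def fun1(v):
    return v


def fun2(v):
    return v + 1


# Watch the function object first bound to fun1 without keeping it alive.
original_ref = weakref.ref(fun1)

# Rebind the name: nothing in this script refers to the first function now.
fun1 = fun2
gc.collect()

# True if the interpreter has reclaimed the original function object.
original_collected = original_ref() is None
print("original function collected:", original_collected)
print(fun1(1))  # the name now runs fun2's code, so this prints 2
```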

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Jonathan Goble
On Thu, Jun 18, 2020 at 5:36 AM Jonathan Fine wrote: > Python allows the user to replace fn.__code__ by a different code object. > This is a rarely done dirty trick. > A dirty trick to you maybe, but occasionally useful. For example, it can be used to implement goto:
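Replacing fn.__code__ looks like this in practice. This is a minimal sketch with two made-up functions, greet and farewell, and is not the goto implementation mentioned in the message. Both functions take no arguments and use no closure variables, which keeps the assignment valid.

```python
def greet():
    return "hello"


def farewell():
    return "goodbye"


before = greet()

# Swap the code object; the function object, its name and its globals stay
# the same, but calls now execute farewell's bytecode.
greet.__code__ = farewell.__code__

after = greet()
print(before, after, greet.__name__)
```

Any scheme for permanent code objects would need to keep this assignment working, or offer an opt-out, which is the concern raised in this message.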

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Steven D'Aprano
On Thu, Jun 18, 2020 at 06:49:13PM +0100, Barry Scott wrote: > The key part of the idea is that the memory holding the ref count is > not adjacent to the memory holding the objects state. Further that > rarely modified state should be kept away from usually modified state. Isn't that going to

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Jonathan Fine
Hi Antoine Thank you for your interest. You wrote: I think you forgot the all-important parts: > 1) How does it work technically? > 2) What performance gain on which benchmark? In my original post I wrote: It might be helpful, after checking the analysis and before coding, to do > some simple

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Antoine Pitrou
Hello, I think you forgot the all-important parts: 1) How does it work technically? 2) What performance gain on which benchmark? Regards Antoine. On Thu, 18 Jun 2020 10:36:11 +0100 Jonathan Fine wrote: > Hi All > > Summary: Shared objects in Unix are a major influence. This proposal can

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Barry Scott
> On 18 Jun 2020, at 19:30, Jonathan Fine wrote: > > Hi Barry > > You wrote: > > We need to define terms here. What do you mean by permanent? > > Good question. I think I answered it in my original post: > > An object is transient if it can be garbage collected. An object is permanent >

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Jonathan Fine
Hi Barry You wrote: We need to define terms here. What do you mean by permanent? > Good question. I think I answered it in my original post: An object is transient if it can be garbage collected. An object is > permanent if it will never be garbage collected. You also wrote: > Being able

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Barry Scott
> On 18 Jun 2020, at 19:00, Jonathan Fine wrote: > > Hi Barry > > You wrote: > Did my last reply cover a possible implementation of this? > e.g. The code is nowhere near the ref-count that triggers COW. > > Could say, do you think it's possible to extend Python so that it can use >

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Jonathan Fine
Hi Barry You wrote: > Did my last reply cover a possible implementation of this? > e.g. The code is nowhere near the ref-count that triggers COW. > Could say, do you think it's possible to extend Python so that it can use permanent code objects, when they are made available? For the moment,

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Barry Scott
> On 18 Jun 2020, at 18:42, Jonathan Fine wrote: > > Hi Barry > > Thank you for your interest in my proposal. Let me try to answer your > question. You wrote: > > To make the code avoid COW you would need to be able to make sure that all > code memory blocks are not mixed in with PyObject

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Barry Scott
> On 18 Jun 2020, at 18:37, Christopher Barker wrote: > > On Thu, Jun 18, 2020 at 9:34 AM Barry Scott > wrote: > To make the code avoid COW you would need to be able to make sure that all > code memory blocks are not mixed in with PyObject memory blocks. > >

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Jonathan Fine
Hi Barry Thank you for your interest in my proposal. Let me try to answer your question. You wrote: To make the code avoid COW you would need to be able to make sure that all > code memory blocks are not mixed in with PyObject memory blocks. > > Then the ref count dance will have trigger COW

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Christopher Barker
On Thu, Jun 18, 2020 at 9:34 AM Barry Scott wrote: > To make the code avoid COW you would need to be able to make sure that all > code memory blocks are not mixed in with PyObject memory blocks. > > Then the ref count dance will have trigger COW for the code. > indeed. CPython already has its
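The "ref count dance" can be seen from Python itself. The sketch below, with illustrative names payload and alias, shows that merely binding another name to an object changes the count stored with that object. In a forked process that kind of write is what makes a shared page private under copy-on-write; the fork itself is not shown here.

```python
import sys

# A fresh, ordinary object whose reference count the interpreter tracks.
payload = object()

before = sys.getrefcount(payload)

# Taking one more reference does not change the object's value, but it does
# update the count kept alongside the object in memory.
alias = payload

after = sys.getrefcount(payload)
print(before, after)
```

Keeping such counters away from data that never changes, such as bytecode, is the separation the quoted paragraph asks for.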

[Python-ideas] Re: Permanent code objects (less memory, quicker load, less Unix Copy On Write)

2020-06-18 Thread Barry Scott
> On 18 Jun 2020, at 10:36, Jonathan Fine wrote: > > Hi All > > Summary: Shared objects in Unix are a major influence. This proposal can be > seen as a first step towards packaging pure Python modules as Unix shared > objects. > > First, there's a high level overview. Then some technical