Hi

SUMMARY: We're starting to discuss implementation. I'm going to focus on
what can be done with only a few changes to the interpreter.

First consider this:
    >>> from sys import getrefcount as grc
    >>> def fn(obj): return grc(obj)

    >>> grc(fn.__code__), grc(fn.__code__.co_code)
    (2, 2)
    >>> fn(fn.__code__), fn(fn.__code__.co_code)
    (5, 4)
    >>> grc(fn.__code__), grc(fn.__code__.co_code)
    (2, 2)

    # This is the bytecode.
    >>> fn.__code__.co_code
    b't\x00|\x00\x83\x01S\x00'

What's happening here? While the interpreter executes the pure Python
function fn, it changes the refcount of both fn.__code__ and
fn.__code__.co_code. This is one of the problems we have to solve, to make
progress.

These refcounts are stored in the objects themselves. So unless the
interpreter is changed, these Python objects can't be stored in read-only
memory. We may have to change code and co_code objects also.

Let's focus on the bytecode, as it's the busiest and often largest part of
the code object. The (ordinary) code object has a field which is (a pointer
to) the co_code attribute, which is a Python bytes object. This is the
bytecode, as a Python object.

Let's instead give the C implementation of fn.__code__ TWO fields. The
first is a pointer, as usual, to the co_code attribute of the code object.
The second is a pointer to the raw data of the co_code object.

When the interpreter executes the code object, the second field tells the
interpreter where to start executing. (This might be why the refcount of
fn.__code__.co_code is incremented during the execution of fn.) The
interpreter doesn't even have to look at the first field.

If we want the raw bytecode of a code object to lie in read-only memory, it
is enough to set the second pointer to that location. In both cases (an
ordinary bytes object, or raw bytecode in read-only memory) the interpreter
reads the memory location of the raw bytecode and executes accordingly.
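
Here is a toy Python model of the two-field idea, just to make it concrete.
It is only a sketch: CodeSketch, raw and run are names I've made up, and the
real change would be in the C struct behind the code object, not in Python.

```python
from dataclasses import dataclass


@dataclass
class CodeSketch:
    """Toy stand-in for the C struct behind a code object (hypothetical)."""
    co_code: object   # first field: the Python-level co_code object
    raw: memoryview   # second field: where the raw bytecode actually lives


def run(code):
    """Pretend interpreter loop: reads (opcode, oparg) pairs via `raw` only."""
    trace = []
    for i in range(0, len(code.raw), 2):
        trace.append((code.raw[i], code.raw[i + 1]))
    return trace


bytecode = b't\x00|\x00\x83\x01S\x00'

# Ordinary case: both fields refer to the same bytes object's data.
ordinary = CodeSketch(co_code=bytecode, raw=memoryview(bytecode))

# "Read-only storage" case, simulated with a separate read-only buffer.
# The first field is left as None here; the loop never looks at it.
storage = memoryview(bytearray(bytecode)).toreadonly()
elsewhere = CodeSketch(co_code=None, raw=storage)

print(run(ordinary) == run(elsewhere))
```

The point is that run() never touches the first field, so it does not care
what kind of object sits there.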

This leaves the problem of the first field. At present, it can only be a
bytes object. When the raw bytecode is in read-only memory, we need a
second sort of object. Its purpose is to 'do the right thing'.

Let's call this sort of object perma_bytes. It's like a bytes object,
except the data is stored elsewhere, in read-only permanent storage.

Aside: If the co_code attribute of a code object is ordinary bytes -- not
perma_bytes -- then the two pointer addresses differ by a constant, namely
the size of the header of a Python bytes object.
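
The constant offset can be observed from Python, at least on CPython. This
is a sketch that relies on two CPython details I'm assuming here: id(obj) is
the object's address, and a ctypes.c_char_p built from a bytes object points
at that object's internal buffer. data_offset is a name I've made up.

```python
import ctypes


def data_offset(b):
    """Distance from a bytes object's address to its raw data (CPython only)."""
    # The c_char_p points into the bytes object's own buffer and keeps the
    # object alive for as long as the c_char_p exists.
    data_addr = ctypes.cast(ctypes.c_char_p(b), ctypes.c_void_p).value
    return data_addr - id(b)


# The offset should not depend on the contents or the length.
print(data_offset(b'abc'), data_offset(b't\x00|\x00\x83\x01S\x00'))

# bytes.__basicsize__ counts the header plus one byte for the trailing NUL.
print(bytes.__basicsize__ - 1)
```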

Any Python language operation on perma_bytes is done by performing the same
Python operation on bytes, but on the raw data that is pointed to. (That
raw data had better still be there, otherwise chaos or worse will result.)
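
To show what I mean by 'the same operation, but on the raw data', here is a
pure Python sketch. PermaBytes is hypothetical, and a memoryview over an
ordinary bytes object stands in for read-only permanent storage. A real
implementation would be in C and would not need to copy the data.

```python
class PermaBytes:
    """Hypothetical bytes-like object whose data lives outside the object."""

    def __init__(self, storage):
        view = memoryview(storage)
        if not view.readonly:
            raise ValueError("perma_bytes storage must be read-only")
        self._view = view

    def _as_bytes(self):
        # Copy the raw data out, then let ordinary bytes do the work.
        return self._view.tobytes()

    def __len__(self):
        return len(self._view)

    def __getitem__(self, index):
        result = self._view[index]
        # Slices come back as memoryviews; bytes would give bytes.
        return result.tobytes() if isinstance(result, memoryview) else result

    def __eq__(self, other):
        if isinstance(other, PermaBytes):
            other = other._as_bytes()
        if isinstance(other, bytes):
            return self._as_bytes() == other
        return NotImplemented

    def __hash__(self):
        return hash(self._as_bytes())

    def __bytes__(self):
        return self._as_bytes()

    def __getattr__(self, name):
        # Only reached for names not defined above: find, hex, startswith, ...
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._as_bytes(), name)


raw = b't\x00|\x00\x83\x01S\x00'
pb = PermaBytes(memoryview(raw))
print(pb == raw, len(pb), pb[:2], pb.hex())
```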

So what have we gained, and what have we lost?

LOST:
1. The fn.__code__ object is bigger by the size of a pointer.
2. Added a perma_bytes object type.
3. Bytes operations on fn.__code__.co_code objects are slower.

GAINED:
1. Can store co_code data in read-only permanent storage.
2. perma_bytes might be useful elsewhere.

It may be possible to improve the outcome by making more changes to the
interpreter. I don't see a way of getting a useful outcome by making fewer.

Here's another way of looking at things. If all the refcounts were stored
in a single array, and the data stored elsewhere, the changing refcount
wouldn't be a problem. Using perma_bytes allows the refcount and the data to
be stored at different locations, thereby avoiding the refcount problem!
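
As a toy illustration of keeping the refcounts apart from the data, here is
a small Python model. SideTableHeap and its methods are made up for this
sketch; it says nothing about how CPython would actually lay out memory.

```python
from array import array


class SideTableHeap:
    """Toy model: payloads are never written to, refcounts live in one array."""

    def __init__(self, payloads):
        # The data: fixed at creation, could in principle be read-only.
        self.data = tuple(bytes(p) for p in payloads)
        # The refcounts: the only part that is mutated afterwards.
        self.refcounts = array('q', [0] * len(self.data))

    def incref(self, slot):
        self.refcounts[slot] += 1

    def decref(self, slot):
        if self.refcounts[slot] <= 0:
            raise ValueError("refcount would go negative")
        self.refcounts[slot] -= 1


heap = SideTableHeap([b't\x00|\x00\x83\x01S\x00', b'xyz'])
before = heap.data
heap.incref(0)
heap.incref(0)
heap.decref(0)
print(heap.refcounts.tolist(), heap.data is before)
```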

I hope this is clear enough, and that it helps. And that it is correct.

I'll let T. S. Eliot have the last word:

    https://faculty.washington.edu/smcohen/453/NamingCats.html
    The Naming of Cats is a difficult matter,
    It isn’t just one of your holiday games;
    You may think at first I’m as mad as a hatter
    When I tell you, a cat must have THREE DIFFERENT NAMES.

We're giving the raw data TWO DIFFERENT POINTERS.

with best wishes

Jonathan
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3AN53F4XEM4BHD4QKX3VOUHXDBM6HEIP/
Code of Conduct: http://python.org/psf/codeofconduct/
