[Python-Dev] Re: Tagged pointer experiment: need help to optimize

2020-09-22 Thread Guido van Rossum
On Tue, Sep 22, 2020 at 4:58 PM Greg Ewing wrote:

> What are you trying to achieve by using tagged pointers?
>
> It seems to me that in a dynamic environment like Python, tagged
> pointer tricks are only ever going to reduce memory usage, not
> make anything faster, and in fact can only make things slower
> if applied everywhere.
>

Hm... mypyc (an experimental Python-to-C compiler bundled with mypy) uses
tagged pointers to encode integers up to 63 bits. I think it's done for
speed, and it's probably faster in part because it avoids slow memory
accesses. But (a) I don't think there's overflow checking, and (b) mypyc is
very careful that tagged integers are never passed to the CPython runtime
(since mypyc apps link with an unmodified CPython runtime for data types,
compatibility with extensions and pure Python code). Nevertheless I think
it puts your blanket claim in some perspective.
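
As a rough illustration of the 63-bit tagged-integer idea mentioned above, here is a minimal Python model of one possible encoding. The low-bit tag, the 64-bit word size, and the helper names (tag_int, untag_int, is_tagged) are all assumptions made for this sketch. It is not mypyc's actual representation, and unlike the scheme described above it includes an explicit range check.

```python
WORD_BITS = 64                       # assumed machine word size
WORD_MASK = (1 << WORD_BITS) - 1
TAG_INT = 0b1                        # assumed tag: low bit set means "inline integer"
INT_MIN = -(1 << (WORD_BITS - 2))    # smallest value that fits in 63 bits
INT_MAX = (1 << (WORD_BITS - 2)) - 1 # largest value that fits in 63 bits


def is_tagged(word: int) -> bool:
    """Return True if the word carries an inline integer rather than an address."""
    return (word & TAG_INT) == TAG_INT


def tag_int(value: int) -> int:
    """Pack a small integer into a word: shift left by one bit and set the tag."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError("value does not fit in 63 bits")
    return ((value << 1) | TAG_INT) & WORD_MASK


def untag_int(word: int) -> int:
    """Recover the integer from a tagged word, restoring the sign."""
    if not is_tagged(word):
        raise ValueError("word is not a tagged integer")
    if word >= 1 << (WORD_BITS - 1):
        word -= 1 << WORD_BITS       # reinterpret the word as a signed value
    return word >> 1                 # arithmetic shift drops the tag bit


round_trip = [untag_int(tag_int(n)) for n in (-5, 0, 256, INT_MIN, INT_MAX)]
```

Aligned heap addresses have their low bits clear, which is why a set low bit can serve as the marker in a scheme like this one.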

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4W3MSE7F5SUVV3AZLATZN4YOT2XZSXYH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Understanding why object defines rich comparison methods

2020-09-22 Thread Greg Ewing

On 23/09/20 12:20 am, Steven D'Aprano wrote:

> Presumably back when rich comparisons were added, the choice would have
> been:
>
> - add one tp_richcompare slot to support all six methods; or
>
> - add six slots, one for each individual dunder
>
> in which case the first option wastes much less space.


I don't know the exact reasons, but it might also have been
because the implementations of the six dunders are usually
very closely related, so having just one function to implement
at the C level is a lot easier for most types.

Also remember that before tp_richcompare existed there was
only tp_compare, which also handled all the comparisons, so
tp_richcompare was likely seen as a generalisation of that.
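
To illustrate why a single comparison function is convenient, here is a small pure-Python sketch in which all six dunders delegate to one helper. The Version class and the _richcompare helper name are hypothetical; this models the idea only and is not how the C-level slot is implemented.

```python
import operator


class Version:
    """Hypothetical type whose six comparisons share one implementation."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def _richcompare(self, other, op):
        # One place decides whether the operands are comparable at all.
        if not isinstance(other, Version):
            return NotImplemented
        return op(self.key, other.key)

    def __eq__(self, other):
        return self._richcompare(other, operator.eq)

    def __ne__(self, other):
        return self._richcompare(other, operator.ne)

    def __lt__(self, other):
        return self._richcompare(other, operator.lt)

    def __le__(self, other):
        return self._richcompare(other, operator.le)

    def __gt__(self, other):
        return self._richcompare(other, operator.gt)

    def __ge__(self, other):
        return self._richcompare(other, operator.ge)

    def __hash__(self):
        # Defining __eq__ removes the inherited hash, so restore one.
        return hash(self.key)


older, newer = Version(3, 8), Version(3, 9)
is_older = older < newer
```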

--
Greg
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/SD2YPQSBV3MQ3GGNHVGRM3QM7WEBCTKG/


[Python-Dev] Re: Tagged pointer experiment: need help to optimize

2020-09-22 Thread Greg Ewing

On 22/09/20 10:06 pm, Victor Stinner wrote:

> I wrote a simple implementation which leaves the code as it
> is, but "unboxes" tagged pointers when a function directly accesses object
> members. Example in listobject.c:
>
>     vl = (PyLongObject*)_Py_TAGPTR_UNBOX(v);
>     wl = (PyLongObject*)_Py_TAGPTR_UNBOX(w);
>     v0 = Py_SIZE(vl) == 0 ? 0 : (sdigit)vl->ob_digit[0];
>     w0 = Py_SIZE(wl) == 0 ? 0 : (sdigit)wl->ob_digit[0];


I think you're using the terms "box" and "unbox" the opposite
way from most people. Usually a "boxed" type is one that lives
on the heap, and an "unboxed" one doesn't.


> My first goal was to write a *correct* (working) implementation, not
> really an optimized implementation.
>
> That's why I'm calling for help to attempt to optimize it ;-)


What are you trying to achieve by using tagged pointers?

It seems to me that in a dynamic environment like Python, tagged
pointer tricks are only ever going to reduce memory usage, not
make anything faster, and in fact can only make things slower
if applied everywhere.

We already have ways of efficiently storing collections of ints
and floats -- array.array, numpy, etc. -- and you only pay for
the overhead of those if you need them.

--
Greg
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QWWQA2U6M3JLQHQNCWJ4MROISHBDNW3I/


[Python-Dev] Re: Enum and the Standard Library

2020-09-22 Thread Serhiy Storchaka
22.09.20 16:57, Ethan Furman wrote:
> On 9/22/20 12:11 AM, Serhiy Storchaka wrote:
>> The only exception is StrEnum -- overriding __str__ of str
>> subclass may be not safe. Some code will call str() implicitly, other
>> will read the string content of the object directly, and they will be
>> different.
> 
> Following up on that:
> 
>     >>> import enum
>     >>>
>     >>> class TestStr(enum.StrEnum):
>     ...     One = '1'
>     ...     Two = '2'
>     ...     Three = '3'
>     ...
>     >>> isinstance(TestStr.One, str)
>     True
> 
>     >>> str(TestStr.One)
>     'TestStr.One'
> 
>     >>> TestStr.One == '1'
>     True
> 
> I agree, str.__str__ needs to be used in this case.

It is more interesting to compare '%s' % (TestStr.One,) and
'{}'.format(TestStr.One). Also str.upper(TestStr.One) and
int(TestStr.One) ignore __str__.
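
The mismatch can be sketched without enum at all. The following uses a plain str subclass (the Labelled name is hypothetical) whose __str__ differs from its stored content; it only shows the general hazard and makes no claim about how any particular enum version formats its members.

```python
class Labelled(str):
    """Hypothetical str subclass whose __str__ differs from its content."""

    def __str__(self):
        return "Labelled.One"


one = Labelled("1")

shown = str(one)          # goes through the overridden __str__
upper = str.upper(one)    # reads the string content directly
number = int(one)         # parses the string content directly
equal = (one == "1")      # compares the string content directly
```

Code that calls str() sees "Labelled.One", while code that reads the content sees "1".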
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/Z3UHCVPV6XQ4DP2QZAGNR4IYY565JG3D/


[Python-Dev] Re: Tagged pointer experiment: need help to optimize

2020-09-22 Thread Mark Shannon

Hi Victor,

There are plenty of reasons for a new, cleaner C-API.
However, performance isn't really one of them.

The C-API is not much of an obstacle to improving the performance of
CPython, at the moment.

There are implementation details that leak into the API, but they are
only an issue when we want to change those details.
At which point, we should consider focused changes rather than the 
vague, sweeping changes suggested in PEP 620.



On 21/09/2020 7:35 pm, Victor Stinner wrote:

> Hi,
>
> I need help optimizing my experimental CPython fork
> which uses tagged pointers.
>
> When I proposed my PEP 620 "Hide implementation details from the C
> API", I was asked about a proof that the PEP unlocks real optimization
> possibilities. So I wrote an implementation of tagged pointers:
> https://github.com/vstinner/cpython/pull/6
>
> The main benefit is the memory usage. For example, list(range(200))
> uses 1656 bytes instead of 7262 (4x less memory).
>
> Sadly, my current simple implementation is 1.1x slower than the
> reference. I suspect that adding a condition to Py_INCREF() and
> Py_DECREF() explains a large part of this overhead.
>
> My implementation uses tagged pointers for:
>
> * integers in the range: [-5; 256]
> * None, True and False singletons
>
> It would be nice to use tagged pointers for a wide range of integer
> numbers, but I wrote a simple implementation: _Py_TAGPTR_UNBOX() has
> to return a borrowed reference. This function should return a strong
> reference to support a larger range.
>
> More information in the document:
> https://github.com/vstinner/cpython/blob/tagged_ptr/TAGGED_POINTERS.rst


A few suggestions:

Make the tagged value something like:
typedef struct {
   intptr_t bits;
} PyTaggedValue;

which will prevent erroneous casts to and from PyObject *.

Add new INCREF/DECREF inline functions that take tagged values.

Never return a borrowed reference (you should know that ;)

Why are you tagging None, True and False? They don't take up any space.

Abstract out the tagging scheme. You will want to change it.

Cheers,
Mark.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5L34BUFCP2BJXA6GITLXFJ6CCIIPQZTS/


[Python-Dev] Re: Enum and the Standard Library

2020-09-22 Thread Ethan Furman

On 9/22/20 12:11 AM, Serhiy Storchaka wrote:


> The only exception is StrEnum -- overriding __str__ of str
> subclass may be not safe. Some code will call str() implicitly, other
> will read the string content of the object directly, and they will be
> different.


Following up on that:

>>> import enum
>>>
>>> class TestStr(enum.StrEnum):
...     One = '1'
...     Two = '2'
...     Three = '3'
...
>>> isinstance(TestStr.One, str)
True

>>> str(TestStr.One)
'TestStr.One'

>>> TestStr.One == '1'
True

I agree, str.__str__ needs to be used in this case.

Thanks, Serhiy!

--
~Ethan~
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/A4ZHD3J47IKNNMQEJFIJQPMFBQH6HAZM/


[Python-Dev] Re: Understanding why object defines rich comparison methods

2020-09-22 Thread Nick Coghlan
On Tue., 22 Sep. 2020, 10:25 pm Steven D'Aprano wrote:

> On Tue, Sep 22, 2020 at 02:13:46PM +0300, Serhiy Storchaka wrote:
> > 22.09.20 12:48, Steven D'Aprano wrote:
> > > Why does `object` define rich comparison dunders `__lt__` etc?
>
> > Because object.__eq__ and object.__ne__ exist. If you define slot
> > tp_richcompare in C, it is exposed as 6 methods __eq__, __ne__, __lt__,
> > __le__, __gt__ and __ge__. It is not possible to determine that __lt__()
> > always returns NotImplemented without running it.
>
> Ah, thank you Serhiy, I thought it might be something like that.
>
> Presumably back when rich comparisons were added, the choice would have
> been:
>
> - add one tp_richcompare slot to support all six methods; or
>
> - add six slots, one for each individual dunder
>
> in which case the first option wastes much less space. Is that
> a reasonable understanding of the motive?
>

Looking at https://www.python.org/dev/peps/pep-0207/ I suspect the
possibility of having 6 different slots didn't even come up (or if it did,
it wasn't considered seriously enough to be mentioned in the PEP).

Instead, the design was an evolution of the old boolean-only tp_compare
slot to allow NumPy to return arrays from array comparison operations.
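
A minimal sketch of what that evolution permits: a rich comparison method may return any object, not just a boolean. The Vec class below is hypothetical and returns a plain list of element-wise results; it only mimics the array use case and does not use NumPy.

```python
class Vec:
    """Hypothetical sequence type with element-wise comparison."""

    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        if not isinstance(other, Vec):
            return NotImplemented
        # The result is a list of booleans, one per element pair.
        return [a < b for a, b in zip(self.items, other.items)]


elementwise = Vec([1, 5, 3]) < Vec([2, 4, 3])
```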

Cheers,
Nick.


Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/A3NR4QSDAATBLF23RQBAKRNETVAOAIK4/


[Python-Dev] Re: Understanding why object defines rich comparison methods

2020-09-22 Thread Steven D'Aprano
On Tue, Sep 22, 2020 at 02:13:46PM +0300, Serhiy Storchaka wrote:
> 22.09.20 12:48, Steven D'Aprano wrote:
> > Why does `object` define rich comparison dunders `__lt__` etc?

> Because object.__eq__ and object.__ne__ exist. If you define slot
> tp_richcompare in C, it is exposed as 6 methods __eq__, __ne__, __lt__,
> __le__, __gt__ and __ge__. It is not possible to determine that __lt__()
> always returns NotImplemented without running it.

Ah, thank you Serhiy, I thought it might be something like that.

Presumably back when rich comparisons were added, the choice would have 
been:

- add one tp_richcompare slot to support all six methods; or

- add six slots, one for each individual dunder

in which case the first option wastes much less space. Is that 
a reasonable understanding of the motive?



-- 
Steve
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/6HXL72CEF6GLDU5W2TC473NRT2KUQJSU/


[Python-Dev] Re: Understanding why object defines rich comparison methods

2020-09-22 Thread Serhiy Storchaka
22.09.20 12:48, Steven D'Aprano wrote:
> Why does `object` define rich comparison dunders `__lt__` etc?
> 
> As far as I can tell, `object.__lt__` etc always return NotImplemented. 
> Merely inheriting from object isn't enough to have comparisons work. So 
> why do they exist at all? Other "do nothing" dunders such as `__add__` 
> aren't defined.

Because object.__eq__ and object.__ne__ exist. If you define slot
tp_richcompare in C, it is exposed as 6 methods __eq__, __ne__, __lt__,
__le__, __gt__ and __ge__. It is not possible to determine that __lt__()
always returns NotImplemented without running it.
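
The behaviour described above can be checked from Python. This short sketch only inspects plain object instances; the variable names are illustrative.

```python
a, b = object(), object()
names = ("__eq__", "__ne__", "__lt__", "__le__", "__gt__", "__ge__")

# All six comparison methods exist on object, but __add__ does not.
has_all_six = all(hasattr(object, name) for name in names)
has_add = hasattr(object, "__add__")

# Calling the four ordering methods directly yields NotImplemented.
ordering_results = [getattr(a, name)(b) for name in names[2:]]

# Through the operator, that turns into a TypeError.
ordering_error = None
try:
    a < b
except TypeError as exc:
    ordering_error = str(exc)
```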
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/NIZUOFVCTF4LEV2KA6VNRCHHYVCD2G3Z/


[Python-Dev] Re: Understanding why object defines rich comparison methods

2020-09-22 Thread Emily Bowman
NotImplemented is like a pure virtual function; failing to implement it
tells you that you forgot part of the contract, except at runtime instead
of compile time. So if you never need them, you're free to elide them, but
if you want full compatibility, you need to implement every part of it.

If someone tried to Obj + Obj and that's just completely nonsensical for
the type, then NotImplemented is a reasonable message, as opposed to
NameError which can be rather confusing. Alternately, you can implement it
and tell the caller exactly why it's not implemented. It's also useful for
anyone looking to subclass by immediately showing them every base method
they might need to implement.
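
For reference, here is a small sketch of how the NotImplemented return value behaves with a binary operator. The Meters class is hypothetical. When both operands decline by returning NotImplemented, the interpreter raises TypeError on the caller's behalf.

```python
class Meters:
    """Hypothetical type that only knows how to add other Meters."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Decline politely so Python can try the other operand's __radd__.
        return NotImplemented


total = (Meters(1) + Meters(2)).value
declined = Meters(1).__add__("x")

error_type = None
try:
    Meters(1) + "x"
except TypeError as exc:
    error_type = type(exc).__name__
```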

-Em
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VPARRUAONYWMKELECKVAC4HP7AI23W2W/


[Python-Dev] Re: Tagged pointer experiment: need help to optimize

2020-09-22 Thread Victor Stinner
On Mon, Sep 21, 2020 at 21:46, Antoine Pitrou wrote:
> > The main benefit is the memory usage. For example, list(range(200))
> > uses 1656 bytes instead of 7262 (4x less memory).
>
> Hmm, how come? Aren't those tiny integers singletons already?

These numbers come from the code:

import sys
def size(l):
    return sys.getsizeof(l) + sum(sys.getsizeof(item) for item in l)

print(size(list(range(200))), "bytes")

The code doesn't ignore the size of singleton objects; it counts their
memory as if each one were allocated separately on the heap.

I took a shortcut: I counted as if my implementation supported
any number, and not only the existing Python singletons.

With my current implementation, list(range(200)) uses neither more nor
less memory, but the same amount.
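
To make the measurement caveat concrete, here is a sketch comparing the naive count with one that charges each distinct object only once. The helper names naive_size and deduplicated_size are hypothetical, and the absolute numbers depend on the platform and Python version, so none are shown.

```python
import sys


def naive_size(seq):
    """Charge every item as if it were a separate heap allocation."""
    return sys.getsizeof(seq) + sum(sys.getsizeof(item) for item in seq)


def deduplicated_size(seq):
    """Charge each distinct object (by identity) only once."""
    seen = set()
    total = sys.getsizeof(seq)
    for item in seq:
        if id(item) not in seen:
            seen.add(id(item))
            total += sys.getsizeof(item)
    return total


zeros = [0] * 200  # the same int object referenced 200 times
naive = naive_size(zeros)
deduplicated = deduplicated_size(zeros)
container_only = sys.getsizeof(zeros)
```

Even the deduplicated figure still charges a shared object to the list, so it remains an upper bound on what the list alone adds.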


> > Sadly, my current simple implementation is 1.1x slower than the
> > reference. I suspect that adding a condition to Py_INCREF() and
> > Py_DECREF() explains a large part of this overhead.
>
> And adding a condition in every place an object is inspected.  Even
> something as simple as Py_TYPE() is not a mere lookup anymore.

You're right that Py_TYPE() also gets a condition to check if the
pointer is tagged:
https://github.com/vstinner/cpython/blob/tagged_ptr/Include/object.h#L203

By the way, Py_INCREF() and Py_DECREF() use:

if (_Py_TAGPTR_IS_TAGGED(op)) {
    // Tagged pointers are immutable
    return;
}

https://github.com/vstinner/cpython/blob/tagged_ptr/Include/object.h#L497
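
The early-return check above can be modelled in a few lines of Python. This is a toy model only: the ToyHeap class, the low-bit tag, and the use of plain integers as stand-ins for addresses are assumptions for illustration and do not reflect the actual fork.

```python
TAG_BIT = 0b1  # assumed: a set low bit marks a tagged (inline) value


class ToyHeap:
    """Tracks reference counts for untagged 'addresses' only."""

    def __init__(self):
        self.refcounts = {}

    def incref(self, word):
        if word & TAG_BIT:
            return  # tagged values have no heap object to count
        self.refcounts[word] = self.refcounts.get(word, 0) + 1

    def decref(self, word):
        if word & TAG_BIT:
            return  # same extra branch on the way down
        self.refcounts[word] -= 1
        if self.refcounts[word] == 0:
            del self.refcounts[word]  # stands in for deallocation


heap = ToyHeap()
heap.incref(0x1000)  # untagged "address": counted
heap.incref(0x1000)
heap.incref(0x7)     # tagged value: ignored
snapshot = dict(heap.refcounts)
```

Every call pays for the extra branch, whether or not the value is tagged.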


> > It would be nice to use tagged pointers for a wide range of integer
> > numbers, but I wrote a simple implementation: _Py_TAGPTR_UNBOX() has
> > to return a borrowed reference. This function should return a strong
> > reference to support a larger range.
>
> Hmm, it sounds a bit weird.  The point of tagged pointers, normally, is
> to avoid creating objects at all.  If you create an object dynamically
> each time a tagged pointer is "dereferenced", then I suspect you won't
> gain anything.

Sure, functions could be optimized to avoid the creation of temporary
objects. I wrote a simple implementation which leaves the code as it
is, but "unboxes" tagged pointers when a function directly accesses object
members. Example in listobject.c:

vl = (PyLongObject*)_Py_TAGPTR_UNBOX(v);
wl = (PyLongObject*)_Py_TAGPTR_UNBOX(w);
v0 = Py_SIZE(vl) == 0 ? 0 : (sdigit)vl->ob_digit[0];
w0 = Py_SIZE(wl) == 0 ? 0 : (sdigit)wl->ob_digit[0];

"vl->ob_digit" crashes if vl is a tagged pointer.

My first goal was to write a *correct* (working) implementation, not
really an optimized implementation.

That's why I'm calling for help to attempt to optimize it ;-)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GQQ56I236ZUU3WOR5PS6N5WIEWNNOD47/


[Python-Dev] Understanding why object defines rich comparison methods

2020-09-22 Thread Steven D'Aprano
Why does `object` define rich comparison dunders `__lt__` etc?

As far as I can tell, `object.__lt__` etc always return NotImplemented. 
Merely inheriting from object isn't enough to have comparisons work. So 
why do they exist at all? Other "do nothing" dunders such as `__add__` 
aren't defined.

I've tried searching for answers, and looked at the source code for rich 
comparisons here:

https://github.com/python/cpython/blob/3.8/Objects/object.c

and read the information here and in the PEP:

https://docs.python.org/3/c-api/object.html#c.PyObject_RichCompare

https://www.python.org/dev/peps/pep-0207/

I'm not sure if I'm looking in the right places, or if I even understand 
exactly what I'm looking at, but I think the relevant code is in 
typeobject.c, which defines a type:


PyTypeObject PyBaseObject_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "object",   /* tp_name */
    ...


and then sets tp_richcompare to `object_richcompare`. I guess that is 
enough to define the dunder methods?

Since (as far as I can tell) they don't do anything, why do they exist?


Thanks in advance.



-- 
Steve
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/TWL5VMCXC6SUEGRQ36CAMPA273E4O2TZ/


[Python-Dev] Re: unable to create PR on github

2020-09-22 Thread Antoine Pitrou
On Mon, 21 Sep 2020 20:56:18 -0700
Ethan Furman  wrote:
> And even more data:
> 
> I added a body to the PR I was originally having trouble with:
>    button stayed gray
> 
> I went away for a while, say 5 - 10 minutes, and when I went back to 
> that screen the button was green.  I created the PR.

Recently, I had to add a body text to create a PR (on another repo),
otherwise the button would be grayed out.  So that seems to be the
reason.

Regards

Antoine.

Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/SAT6OGCSQEEKU6Y47AECSNW7OS43RT4Z/


[Python-Dev] Re: Enum and the Standard Library

2020-09-22 Thread Serhiy Storchaka
19.09.20 00:44, Ethan Furman wrote:
> I'm looking for arguments relating to:
> 
> - should _convert_ make the default __repr__ be module_name.member_name?

In most cases enums with _convert_ are used to replace old module
globals. They are accessible as module_name.member_name and always used
as module_name.member_name in user code. Also module_name.member_name is
usually shorter than module_name.class_name.member_name or
.

And the main advantage to me is using repr in compound objects:
"foo.Command(action=foo.READ, kind=foo.FILE)" can be copied just from
the debug output to the test code, in contrast to
"foo.Command(action=, kind=>)" which needs a lot of editing (and I often
need to copy a list or a dict of such objects). I always override the
default __repr__ in production code.
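
As a sketch of the kind of override meant here, the following compares the default Enum repr with a shorter, copy-pastable one. The class names and the "foo" module name are hypothetical, and this is a hand-written __repr__, not a description of what _convert_ itself does.

```python
import enum

MODULE_NAME = "foo"  # hypothetical module that would export READ and WRITE


class DefaultAction(enum.Enum):
    READ = 1
    WRITE = 2


class Action(enum.Enum):
    READ = 1
    WRITE = 2

    def __repr__(self):
        # Mirrors how the member would be spelled in user code.
        return f"{MODULE_NAME}.{self.name}"


default_repr = repr([DefaultAction.READ, DefaultAction.WRITE])
short_repr = repr([Action.READ, Action.WRITE])
```

The short form can be pasted back into test code as long as foo.READ and foo.WRITE exist as module globals.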

> - should _convert_ make the default __str__ be the same, or be the
>   numeric value?

I do not think that exposing the numeric value in __str__ would be
useful. Numeric values are often arbitrary; this is why we use names in
the first place. The only exception is StrEnum -- overriding __str__ of a str
subclass may not be safe. Some code will call str() implicitly, other code
will read the string content of the object directly, and the results will
be different.

I would consider returning just the member name from __str__. It has
its pros and cons, so in the face of ambiguity it is better to
restore the default implementation: __str__ = object.__str__.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WZGSSQ5B6TAUSY5C2PPC7QTHRWBST6GJ/
Code of Conduct: http://python.org/psf/codeofconduct/