[Python-ideas] Re: Using explicit parenthesization to convey aspects of semantic meaning?

2020-12-13 Thread David Mertz
On Sun, Dec 13, 2020, 5:11 PM Paul Sokolovsky wrote:

> a + b + c   vs   a + (b + c)
>
> Here, there's even no guarantee of the same result, if we have user
> objects with weirdly overloaded __add__().
>

0.1 + 0.2 + 0.3 != 0.1 + (0.2 + 0.3)
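
A quick check of the inequality above, as a minimal sketch: float addition rounds after every step, so the grouping decides which intermediate result gets rounded. The variable names are only for illustration.

```python
# Float addition rounds after each operation, so grouping matters.
left = (0.1 + 0.2) + 0.3    # same grouping as 0.1 + 0.2 + 0.3
right = 0.1 + (0.2 + 0.3)

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left != right)  # True
```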

>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/F7NE7HPSKZUHTYLDM7OTZBUYHCDZDD7Y/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Using explicit parenthesization to convey aspects of semantic meaning?

2020-12-13 Thread Chris Angelico
On Mon, Dec 14, 2020 at 5:57 PM Paul Sokolovsky  wrote:
>
> But that's what the question was about, and why there was the intro!
> Let's please go over it again. Do you agree with the following:
>
> a + (b + c)  <=>  t = b + c; a + t
>
> ?
>
> Where "<=>" is the equivalence operator. I do hope you agree, because
> it's the basis both for implementing evaluation and for refactoring
> rules, and the latter is especially important for a line-oriented
> language like Python, where wrapping an expression across lines requires
> explicit syntactic markers, which some people consider ugly, so there
> should be clear rules for splitting long expressions which don't affect
> their semantics.

It really depends on what you mean by "equivalent". For instance, I'm
sure YOU will agree that they have the semantic difference of causing
an assignment to the name 't'. Additionally, Python will evaluate a
before b and c in the first example, but in the second it must evaluate
b and c, add them together, and only after that evaluate a. So, no, they aren't
entirely equivalent. Obviously, in many situations, the programmer
will know what's functionally equivalent, but the interpreter can't.
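
The ordering difference described above can be observed directly. This is a small sketch, not anything from the original thread: the helper `ev` and the list `order` are made-up names that simply record when each operand is evaluated.

```python
order = []

def ev(name, value):
    """Record that an operand was evaluated, then return its value."""
    order.append(name)
    return value

# Form 1: a + (b + c), written as a single expression.
result_inline = ev("a", 1) + (ev("b", 2) + ev("c", 3))
order_inline = list(order)

# Form 2: t = b + c; a + t
order.clear()
t = ev("b", 2) + ev("c", 3)
result_split = ev("a", 1) + t
order_split = list(order)

print(order_inline)   # ['a', 'b', 'c']
print(order_split)    # ['b', 'c', 'a']
```

The results are equal here, but any operand with side effects would make the two forms distinguishable.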

Clarify what you mean by equivalence and I will be able to tell you
whether I agree or not. (It's okay if your definition of equivalent
can't actually be described in terms of actual Python code, just as
long as you can explain which differences matter and which don't.)

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AV6WOKO6NFIDI63FMTVMEF7UOEFHI7LH/


[Python-ideas] Re: Using explicit parenthesization to convey aspects of semantic meaning?

2020-12-13 Thread Paul Sokolovsky
Hello,

On Mon, 14 Dec 2020 09:37:42 +1100
Chris Angelico  wrote:

> >   2           8 LOAD_NAME                0 (obj)
> >              10 LOAD_METHOD              1 (meth)
...
> >
> >   3          16 LOAD_NAME                0 (obj)
> >              18 LOAD_ATTR                1 (meth)
...

> Creating bound method objects can be expensive. Python has a history
> of noticing ways to improve performance without changing semantics,
> and implementing them. Details here:
> 
> https://docs.python.org/3/library/dis.html#opcode-LOAD_METHOD

Thanks for the response. And I know all that. LOAD_METHOD/CALL_METHOD
was there in MicroPython right from the start. Like, the very first
commit to the project in 2013 already had it:
https://github.com/micropython/micropython/commit/429d71943d6b94c7dc3c40a39ff1a09742c77dc2#diff-ce2e872144bcba2e9a4c2e84c748fa6f3bba4f08406b27e69a88415944bf0c0eR108

> If you force the bound method object to be created (by putting it in a
> variable),

But that's what the question was about, and why there was the intro!
Let's please go over it again. Do you agree with the following:

a + (b + c)  <=>  t = b + c; a + t

?

Where "<=>" is the equivalence operator. I do hope you agree, because
it's the basis both for implementing evaluation and for refactoring
rules, and the latter is especially important for a line-oriented
language like Python, where wrapping an expression across lines requires
explicit syntactic markers, which some people consider ugly, so there
should be clear rules for splitting long expressions which don't affect
their semantics.

So ok, if you agree with the above, do you agree with the below:

(a.b)()  <=>  t = a.b; t()

?

And I really wonder what depth of non-monotonic logic we can reach on
trying to disagree with the above ;-).
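
For a plain method on a well-behaved object, the two spellings can be compared directly. This is only a sketch: the class `A` and its method `b` are made up for illustration, and it checks the observable result, not how the interpreter gets there.

```python
class A:
    def b(self):
        return "called on " + type(self).__name__

a = A()

direct = (a.b)()   # call through the attribute expression
t = a.b            # forces a bound method object to exist
via_temp = t()

print(direct == via_temp)   # True
print(t.__self__ is a)      # True
```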

Python does have cases where syntactic refactoring is not possible. The
most infamous example is super() (which reminds me that, when its args
were made optional, it would have been much better to make it just
"super."; there would be much less desire to "refactor" it). But the
more such places a language has, the less regular it is, and the harder
it is to learn, reason about, and optimize. And the more poorly designed, too. So,
any language with aspiration to not be called words should avoid such
cases. And then again, what can we tell about:
"(a.b)()  <=>  t = a.b; t()"

[]

> This is why lots of us are unimpressed by your strict mode - CPython
> is perfectly capable of optimizing the common cases without changing
> the semantics, so why change the semantics? :)

But please remember that you're talking with someone who has taken
LOAD_METHOD for granted since 2013, and inline caches since 2015. So,
what would be the reason to take all that for granted and still proceed
with the strict mode? Oh, the reasons are obvious: a) it's the natural
extension of the above; b) it allows reaching much deeper (straight to
the machine code, again), and by much cheaper means (the machine code
for a call will contain the same as in C, not 10x more code in guards).


For comparison, CPython added LOAD_METHOD in 2016. And lookup caching
started to be added in 2019. And it took almost 1.5 years to extend
caching from a single opcode to 2nd one. 1.5 years, Chris!

commit 91234a16367b56ca03ee289f7c03a34d4cfec4c8
Date:   Mon Jun 3 21:30:58 2019 +0900
bpo-26219: per opcode cache for LOAD_GLOBAL (GH-12884)

commit 109826c8508dd02e06ae0f1784f1d202495a8680
Date:   Tue Oct 20 06:22:44 2020 +0100
bpo-42093: Add opcode cache for LOAD_ATTR (GH-22803)

And 3rd one, LOAD_NAME, isn't covered, and it's easy to see why: instead
of using best-practice uniform inline caches, desired-to-be-better
Python semantics spawned the following monsters:

co->co_opcache_map = (unsigned char *)PyMem_Calloc(co_size, 1);

typedef struct {
    PyObject *ptr;  /* Cached pointer (borrowed reference) */
    uint64_t globals_ver;  /* ma_version of global dict */
    uint64_t builtins_ver; /* ma_version of builtin dict */
} _PyOpcache_LoadGlobal;


All that stuff sits in your L1 cache, thrashes something else in and
out all the time, and makes it all still slow, slow, slow. "Perfectly
capable" you say? Heh.
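
To make the struct above easier to follow, here is a rough Python model of a version-guarded lookup cache. It is a sketch under loud assumptions: `VersionedDict` and `LoadGlobalCache` are hypothetical names, and the real CPython code is C and differs in many details. Only the three field names come from the struct quoted above.

```python
class VersionedDict(dict):
    """dict stand-in that bumps a version counter on item writes (assumption)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class LoadGlobalCache:
    """Caches one name lookup, guarded by the versions of both namespaces."""

    def __init__(self, name, globals_ns, builtins_ns):
        self.name = name
        self.globals_ns = globals_ns
        self.builtins_ns = builtins_ns
        self.ptr = None           # cached value
        self.globals_ver = None   # namespace versions seen when ptr was cached
        self.builtins_ver = None
        self.hits = 0
        self.misses = 0

    def load(self):
        if (self.globals_ver == self.globals_ns.version
                and self.builtins_ver == self.builtins_ns.version):
            self.hits += 1
            return self.ptr
        # Slow path: the usual two-level lookup, then refresh the cache.
        self.misses += 1
        if self.name in self.globals_ns:
            value = self.globals_ns[self.name]
        else:
            value = self.builtins_ns[self.name]
        self.ptr = value
        self.globals_ver = self.globals_ns.version
        self.builtins_ver = self.builtins_ns.version
        return value


module_globals = VersionedDict()
module_builtins = VersionedDict(min=min)
cache = LoadGlobalCache("min", module_globals, module_builtins)

cache.load()                   # miss: first lookup fills the cache
cache.load()                   # hit: both versions unchanged
module_globals["x"] = 1        # any write to globals invalidates the guard
cache.load()                   # miss again
print(cache.hits, cache.misses)   # 1 2
```

The guard comparison on every load is the cost being argued about here: it is cheaper than a full lookup, but it is still extra state and extra checks per cached instruction.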



-- 
Best regards,
 Paul  mailto:pmis...@gmail.com
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NXDYVHKKMBL5EVXE4B5PORAPV3XHXRGW/


[Python-ideas] Re: Using explicit parenthesization to convey aspects of semantic meaning?

2020-12-13 Thread Chris Angelico
On Mon, Dec 14, 2020 at 9:11 AM Paul Sokolovsky  wrote:
> What would be the explanation for all that?
>
>
> For reference, the disassembly of the 3 lines with CPython3.7 is
> provided:
>
>   1           0 LOAD_NAME                0 (obj)
>               2 LOAD_METHOD              1 (meth)
>               4 CALL_METHOD              0
>               6 POP_TOP
>
>   2           8 LOAD_NAME                0 (obj)
>              10 LOAD_METHOD              1 (meth)
>              12 CALL_METHOD              0
>              14 POP_TOP
>
>   3          16 LOAD_NAME                0 (obj)
>              18 LOAD_ATTR                1 (meth)
>              20 STORE_NAME               2 (t)
>              22 LOAD_NAME                2 (t)
>              24 CALL_FUNCTION            0
>              26 POP_TOP
> ...

Creating bound method objects can be expensive. Python has a history
of noticing ways to improve performance without changing semantics,
and implementing them. Details here:

https://docs.python.org/3/library/dis.html#opcode-LOAD_METHOD

If you force the bound method object to be created (by putting it in a
variable), the semantics should be the same, but performance will be
lower. Consider:

rosuav@sikorsky:~$ python3.10 -c 'import dis; dis.dis(lambda obj: (obj.meth,)[0]())'
  1           0 LOAD_FAST                0 (obj)
              2 LOAD_ATTR                0 (meth)
              4 BUILD_TUPLE              1
              6 LOAD_CONST               1 (0)
              8 BINARY_SUBSCR
             10 CALL_FUNCTION            0
             12 RETURN_VALUE
rosuav@sikorsky:~$ python3.10 -c 'import dis; dis.dis(lambda obj: (obj.meth(),)[0])'
  1           0 LOAD_FAST                0 (obj)
              2 LOAD_METHOD              0 (meth)
              4 CALL_METHOD              0
              6 BUILD_TUPLE              1
              8 LOAD_CONST               1 (0)
             10 BINARY_SUBSCR
             12 RETURN_VALUE

rosuav@sikorsky:~$ python3.10 -m timeit -s 'x, f = 142857, lambda obj: (obj.bit_length(),)[0]' 'f(x)'
200 loops, best of 5: 101 nsec per loop
rosuav@sikorsky:~$ python3.10 -m timeit -s 'x, f = 142857, lambda obj: (obj.bit_length,)[0]()' 'f(x)'
200 loops, best of 5: 124 nsec per loop

rosuav@sikorsky:~$ python3.6 -m timeit -s 'x, f = 142857, lambda obj: (obj.bit_length(),)[0]' 'f(x)'
1000 loops, best of 3: 0.124 usec per loop
rosuav@sikorsky:~$ python3.6 -m timeit -s 'x, f = 142857, lambda obj: (obj.bit_length,)[0]()' 'f(x)'
1000 loops, best of 3: 0.123 usec per loop

Measurable improvement in 3.10, indistinguishable in 3.6.

This is why lots of us are unimpressed by your strict mode - CPython
is perfectly capable of optimizing the common cases without changing
the semantics, so why change the semantics? :)

ChrisA
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5KBT5DI6NKKLS4QEDAYNIHKOFBGL7PFP/


[Python-ideas] Using explicit parenthesization to convey aspects of semantic meaning?

2020-12-13 Thread Paul Sokolovsky
Hello,

How would you feel if explicit parens were used to convey additional
semantic meaning? That seems like a pretty dumb question, because,
well, parens *are* used to convey additional semantic meaning. E.g.:

1 + 2 + 3   vs   1 + (2 + 3)

The result is the same, but somehow I wanted to emphasize that 2 and 3
should be added together, and not somehow else.

a + b + c   vs   a + (b + c)

Here, there's even no guarantee of the same result, if we have user
objects with weirdly overloaded __add__().
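
As an illustration of such an overload, here is a minimal sketch. The class `Mid` is invented for this example: its `__add__` returns the midpoint of the two operands, which is not associative.

```python
class Mid:
    """Toy number whose "+" returns the midpoint of the two operands."""

    def __init__(self, v):
        self.v = v

    def __add__(self, other):
        return Mid((self.v + other.v) / 2)

a, b, c = Mid(1), Mid(2), Mid(3)

left = (a + b + c).v      # ((1 + 2) / 2 + 3) / 2
right = (a + (b + c)).v   # (1 + (2 + 3) / 2) / 2

print(left)    # 2.25
print(right)   # 1.75
```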

Thanks for hanging with me so far, we're getting to the crux of the
question:

Do you think there can be difference between the following two
expressions:

obj.meth()
(obj.meth)()

?

The question is definitely with a trick (why else there would be the
intro), and first answer which comes to mind might not be the right
one. As a hint, to try to get a grounded answer to that question, it
would be useful to look at the difference in disassembly of the above
code in CPython3.6 vs CPython3.7 (or later):

python3.6 -m dis meth_call.py
python3.7 -m dis meth_call.py

Then, to try to explain the difference at the suitable level of
abstraction. If that doesn't provide enough differentiation, it might
be helpful to add the 3rd line:

t = obj.meth; t()

And run all 3 lines thru CPython3.7, and see if the pattern is now
visible, and a distortion in the pattern too. 

What would be the explanation for all that? 


For reference, the disassembly of the 3 lines with CPython3.7 is
provided:

  1           0 LOAD_NAME                0 (obj)
              2 LOAD_METHOD              1 (meth)
              4 CALL_METHOD              0
              6 POP_TOP

  2           8 LOAD_NAME                0 (obj)
             10 LOAD_METHOD              1 (meth)
             12 CALL_METHOD              0
             14 POP_TOP

  3          16 LOAD_NAME                0 (obj)
             18 LOAD_ATTR                1 (meth)
             20 STORE_NAME               2 (t)
             22 LOAD_NAME                2 (t)
             24 CALL_FUNCTION            0
             26 POP_TOP
...
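
The same comparison can be scripted instead of read off by eye. A minimal sketch: the helper `opnames` is a made-up name, and the exact opcode names printed depend on the CPython version, so only the equal/not-equal pattern is meaningful. The source is compiled but never executed, so `obj` does not need to exist.

```python
import dis

def opnames(source):
    """Compile a statement (without running it) and list its opcode names."""
    code = compile(source, "<meth_call>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

plain = opnames("obj.meth()")
parenthesized = opnames("(obj.meth)()")
via_temp = opnames("t = obj.meth; t()")

print(plain == parenthesized)   # True: the parentheses do not survive parsing
print(plain == via_temp)        # False: the temporary adds a store and a load
```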

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/VGOUM3QYUIT4F2WDUJ224IDZ2MSQJD3E/


[Python-ideas] Re: a new data type: NamedValue -- similar to Enum

2020-12-13 Thread Ethan Furman

On 12/12/20 7:25 PM, Steven D'Aprano wrote:
> On Sat, Dec 12, 2020 at 06:00:17PM -0800, Ethan Furman wrote:

>> Enum is great!  Okay, okay, my opinion might be biased.  ;)
>>
>> There is one area where Enum is not great -- for a bunch of unrelated
>> values.
>
> I don't know how to interpret that. Surely *in practice* enums are
> always going to be related in some sense?

I certainly hope so -- that was one of the points in creating Enum.

> I don't expect to create an enum class like this:
>
>     class BunchOfRandomStuff(Enum):
>         ANIMAL = 'Mustela nivalis (Least Weasel)'
>         PRIME = 503
>         EULER_MASCHERONI_CONSTANT = 0.5772156649015329
>         BEST_PICTURE_1984 = 'Amadeus'
>         DISTANCE_MELBOURNE_SYDNEY = (16040497, "rack mount units")

Which is why I said, "Enum is not great for a bunch of unrelated values".

> If I did create such an unusual collection of enums, what is the
> standard Enum lacking that makes it "not great"? It seems to work fine
> to me.

Lots of things work -- calling `__len__` instead of `len()` works, but `__len__` is not the best way to get the length 
of an object.


Enums are not great for a bunch of unrelated values because:

- duplicate values would all get aliased to one name
- ordinary values should not be compared using `is`
- standard Enums cannot be seamlessly used as their actual value (example in 
other email)
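
The first and third points can be seen with the standard library's Enum. This is a small sketch; the class `Limits` and its member names are made up for illustration.

```python
from enum import Enum

class Limits(Enum):
    MAX_RETRIES = 3
    MAX_WORKERS = 3   # same value, so this becomes an alias of MAX_RETRIES

print(Limits.MAX_WORKERS is Limits.MAX_RETRIES)   # True
print(Limits.MAX_WORKERS.name)                    # MAX_RETRIES
print([member.name for member in Limits])         # ['MAX_RETRIES']
print(Limits.MAX_RETRIES == 3)                    # False: not usable as a plain 3
print(Limits.MAX_RETRIES.value == 3)              # True
```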

[...]

> I don't understand this. Are you suggesting that NamedValues will have a
> `type` attribute **like Enum**, or **in addition** to what Enum provides
> (value and name)?

To be honest, Enum may have a "type" attribute at this point, I don't remember.  NamedValues would definitely have a 
"type" attribute whose primary purpose is to make the value attribute work.


As an example, consider sre_constants.MAXREPEAT vs sre_constants.MAX_REPEAT (the only difference is the underscore -- took 
me a few moments to figure that out).


The sre_constants._NamedIntConstant class adds a name attribute, and returns 
that as the repr().
```
>>> sre_constants.MAXREPEAT
MAXREPEAT
>>> sre_constants.MAX_REPEAT
MAX_REPEAT
```

Not very illuminating.  I ended up getting the actual value by calling `int()` 
on them.

```
>>> int(sre_constants.MAXREPEAT)
4294967295
>>> int(sre_constants.MAX_REPEAT)
42
```

By adding a "type" attribute, getting something useful becomes a little easier:

```
    @property
    def value(self):
        return self._type_(self)
```
or maybe
```
    @property
    def value(self):
        return self._type_.__repr__(self)
```

>> unlike Enum, duplicates are allowed
>
> Um, do you mean duplicate names? How will that work?

No, duplicate values -- but in an Enum the names given to the duplicate value become aliases to the original name/value, 
while duplicates in NamedValue would remain different objects.


>> unlike Enum, new values can be added after class definition
>
> Is there a use-case for this?

Yes.

> If there is such a use-case, could we not just given Enums an API for
> adding new values, rather than invent a whole new Enum-by-another-name?

While NamedValues have similarities to Enum (.name, .value, human readable 
repr()), they are not Enums.

>> unlike Enum, a NamedValue can always be used as-is, even if no data type
>> has been mixed in -- in other words, there is no functional difference
>> between MyIntConstants(int, NamedValue) and MyConstants(NamedValue).
>
> Sorry, I don't get that either. How can Enums not be used "as-is"? What
> does that mean?

It means that you can't do things with the actual value of Color.RED, whether that value is an int, a string, or a 
whatever, without going through the value attribute.


> Are you suggesting that NamedValue subclasses will automatically insert
> `int` into their MRO?

No, I'm saying that a NamedValue subclass will have int, or string, or frozenset, or whatever the actual values' types 
are, in their mro:


```
    def __new__(cls, value, name):
        actual_type = type(value)
        new_value_type = type(cls.__name__, (cls, type(value)), {})
        obj = actual_type.__new__(new_value_type, value)
        obj._name_ = name
        obj._type_ = actual_type
        return obj
```
The subclasses are created on the fly.  The production code will cache the new 
subclasses so they're only created once.
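
Putting the pieces above together, a runnable sketch might look like the following. Treat it as an illustration only: the `_subclass_cache` dict and its key are assumptions about how the caching could be done, not the production code, and it is only exercised with immutable value types such as int and frozenset.

```python
class NamedValue:
    # Assumption: cache of generated subclasses, keyed by (class, value type).
    _subclass_cache = {}

    def __new__(cls, value, name):
        actual_type = type(value)
        key = (cls, actual_type)
        new_value_type = cls._subclass_cache.get(key)
        if new_value_type is None:
            # Create the mixed subclass on the fly, only once per value type.
            new_value_type = type(cls.__name__, (cls, actual_type), {})
            cls._subclass_cache[key] = new_value_type
        obj = actual_type.__new__(new_value_type, value)
        obj._name_ = name
        obj._type_ = actual_type
        return obj

    @property
    def name(self):
        return self._name_

    @property
    def value(self):
        # Convert back to the plain underlying type.
        return self._type_(self)


five = NamedValue(5, "FIVE")
digits = NamedValue(frozenset("0123456789"), "DIGITS")

print(five + 1)                 # 6: usable as-is, no .value needed
print("7" in digits)            # True
print(five.name, type(five.value) is int)   # FIVE True
```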

>> If sre_constants was using a new data type, it should probably be IntEnum
>> instead.  But sre_parse is a good candidate for NamedValues:
>>
>>
>>     class K(NamedValues):
>>         DIGITS = frozenset("0123456789")
> [...snip additional name/value pairs...]
>
>> and in use:
>>
>>  >>> K.DIGITS
>>  K.DIGITS
>>  >>> K.DIGITS.name
>>  'DIGITS'
>>  >>> K.DIGITS.value
>>  frozenset("0123456789")
>
> Why not just use an Enum?

Why use a Counter instead of defaultdict instead of dict?  Because, depending on the task, one is more appropriate than 
the others.


--
~Ethan~


[Python-ideas] Re: tkinter Variable implementation

2020-12-13 Thread Ivo Shipkaliev
Yes, will do.

Regards
Ivo Shipkaliev


On Sun, 13 Dec 2020 at 15:40, Ronald Oussoren 
wrote:

> Could you file an issue about this on bugs.python.org?
>
> Ronald
>
> —
>
> Twitter / micro.blog: @ronaldoussoren
> Blog: https://blog.ronaldoussoren.net/
>
> On 13 Dec 2020, at 03:16, Ivo Shipkaliev  wrote:
>
> Hiya :)
>
> I'm recently working with tkinter and I noticed that:
>
> >>> import tkinter as tk
> >>> string_var = tk.StringVar()
>
> is throwing an AttributeError:
>
> > Traceback (most recent call last):
> >   . . .
> >   File "...\lib\tkinter\__init__.py", line 505, in __init__
> > Variable.__init__(self, master, value, name)
> >   File "...\lib\tkinter\__init__.py", line 335, in __init__
> > self._root = master._root()
> > AttributeError: 'NoneType' object has no attribute '_root'
>
> ... which is making me question myself:
> -- "Why am I looking for a '_root' attribute on a NoneType object here?",
> or
> -- "How did I end up with a NoneType, to be searched for a '_root'
> attribute in the first place?"
> I have done something wrong or I don't know something apparently.
>
> Following the traceback, looking at "tkinter\__init__.py", lines 333-335
> leads us to the implementation of StringVar's parent: class Variable:
>
> 333 >         if not master:
> 334 >             master = _default_root
> 335 >         self._root = master._root()
>
> Class Variable constructor interface:
>
> 317 > def __init__(self, master=None, value=None, name=None):
>
> Clearly, on line 335, we are relying on the "master" to have a "_root"
> callable. If "master" was not declared upon class instantiation (line
> 317), a value can come from _default_root, which is a global module
> variable. _default_root is holding a Tk instance, if one was
> instantiated. So, if neither "master" nor _default_root was defined, line
> 335 will result in an AttributeError.
>
> Now, would it be a good idea if we keep the user better informed by making
> the aforementioned lines look something like?:
>
> 333 >         if not master:
> 334 >             master = _default_root
> 335 >         if not master:
> 336 >             raise RuntimeError(f"{self.__class__.__name__!r}: 'master' must be defined or a Tk instance must be present!")
> 337 >         self._root = master._root()
>
> ... and not relying on this inexplicable AttributeError.
>
> Thank you for your time!
>
> Regards
> Ivo Shipkaliev
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/FSQUFJJQDNSRN4HI7VFXWCNO46YLXQDS/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
>
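
The proposed check from the quoted message can be tried without a display by using stand-ins. Everything below is a sketch: `FakeRoot`, this `Variable`, and this module-level `_default_root` only mimic the shape of the quoted tkinter lines and are not tkinter's actual code.

```python
_default_root = None   # stand-in for tkinter's module-level default root


class FakeRoot:
    """Minimal stand-in for a Tk instance: it only provides _root()."""

    def _root(self):
        return self


class Variable:
    def __init__(self, master=None, value=None, name=None):
        if not master:
            master = _default_root
        if not master:
            # The proposed explicit error instead of an AttributeError on None.
            raise RuntimeError(
                f"{self.__class__.__name__!r}: 'master' must be defined "
                "or a Tk instance must be present!"
            )
        self._root = master._root()


class StringVar(Variable):
    pass


def try_create(master=None):
    """Return 'ok' on success, or the error message on failure."""
    try:
        StringVar(master)
    except RuntimeError as exc:
        return str(exc)
    return "ok"


print(try_create())            # the explicit RuntimeError message
print(try_create(FakeRoot()))  # ok
```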
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6Y4JPTS2UZVTW27CRPNGQRQ3VAODHBPM/


[Python-ideas] Re: tkinter Variable implementation

2020-12-13 Thread Ronald Oussoren via Python-ideas
Could you file an issue about this on bugs.python.org ?

Ronald

—

Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/

> On 13 Dec 2020, at 03:16, Ivo Shipkaliev  wrote:
> 
> Hiya :)
> 
> I'm recently working with tkinter and I noticed that:
> 
> >>> import tkinter as tk
> >>> string_var = tk.StringVar()
> 
> is throwing an AttributeError:
> 
> > Traceback (most recent call last):
> >   . . .
> >   File "...\lib\tkinter\__init__.py", line 505, in __init__
> > Variable.__init__(self, master, value, name)
> >   File "...\lib\tkinter\__init__.py", line 335, in __init__
> > self._root = master._root()
> > AttributeError: 'NoneType' object has no attribute '_root'
> 
> ... which is making me question myself:
> -- "Why am I looking for a '_root' attribute on a NoneType object here?", or
> -- "How did I end up with a NoneType, to be searched for a '_root' attribute 
> in the first place?"
> I have done something wrong or I don't know something apparently.
> 
> Following the traceback, looking at "tkinter\__init__.py", lines 333-335 
> leads us to the implementation of StringVar's parent: class Variable:
> 
> 333 >         if not master:
> 334 >             master = _default_root
> 335 >         self._root = master._root()
> 
> Class Variable constructor interface:
> 
> 317 > def __init__(self, master=None, value=None, name=None):
> 
> Clearly, on line 335, we are relying on the "master" to have a "_root" 
> callable. If "master" was not declared upon class instantiation (line 317), a 
> value can come from _default_root, which is a global module variable. 
> _default_root is holding a Tk instance, if one was instantiated. So, if 
> neither "master" nor _default_root was defined, line 335 will result in an 
> AttributeError.
> 
> Now, would it be a good idea if we keep the user better informed by making 
> the aforementioned lines look something like?:
> 
> 333 >         if not master:
> 334 >             master = _default_root
> 335 >         if not master:
> 336 >             raise RuntimeError(f"{self.__class__.__name__!r}: 'master' must be defined or a Tk instance must be present!")
> 337 >         self._root = master._root()
> 
> ... and not relying on this inexplicable AttributeError.
> 
> Thank you for your time!
> 
> Regards
> Ivo Shipkaliev
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-ideas@python.org/message/FSQUFJJQDNSRN4HI7VFXWCNO46YLXQDS/
> Code of Conduct: http://python.org/psf/codeofconduct/

___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LSMSO4VCE5NMFPZGHYFLCOVI25SGCC5A/


[Python-ideas] Re: tkinter Variable implementation

2020-12-13 Thread Ivo Shipkaliev
Thank you!

On Sun, 13 Dec 2020 at 03:57, William Pickard  wrote:

> Actually, the error message should read something like this: A valid
> instance of Tk is required.
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/EIEQ2PFV2MAEMQDZDU76KGCZFO2ZEI5P/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/53TXCLXOMLSCVWLYC23CEICVYW34HGL6/


[Python-ideas] Re: [RFC] "Strict execution mode" (TL;DR version)

2020-12-13 Thread Paul Sokolovsky
Hello,

On Sun, 6 Dec 2020 00:37:15 -0500
Ricky Teachey  wrote:

[]

> From another who like CHB is just random person on this list (but
> probably even more "random"), interested enough to have read the
> entire thread and the other thread, but not knowledgeable or
> competent enough to offer detailed comments that are going to be
> particularly helpful to anyone, I'd say this:
> 
> If you could actually make a fully functioning python that is
> significantly faster by doing this, and it introduced this two-stage
> interpreter idea with a much more strict secondary stage, and did not
> require all kinds of additional syntax in the code to get the speed
> improvements (like cython for example), I'd think you really might
> have something that would help a lot of people see the benefits of a
> potential switch to a stricter paradigm of writing in an ostensibly
> dynamic language that nonetheless would now have to be written much
> more less dynamically when inside functions and class methods.
> 
> It seems to me that if the speed increase is enough, it could be
> worth the decrease in flexibility, potentially. At least enough to
> support the existence of a second mode of python execution (whether
> that mode lives in cpython or not doesn't seem to much matter to me).
> 
> However I think maybe a big big problem is probably going to be the
> lack of interest in very popular third party, and even standard,
> libraries to rewrite their code to fit a D1S2 (dynamic stage one,
> strict stage two) interpretation model. It seems likely many heavily
> used packages will simply be near totally broken for your strict
> interpreter,

That's exactly the interesting part, which would be interesting to
discuss, with interested parties. Just to give an idea of my timeline:

I coded the basic strict mode implementation for Pycopy in August last
year. Then made another pass over in November last year. Before merging
it to Pycopy mainline, I wanted to make sure it's viable with "general
Python code". That's why for winter holidays 2019/2020 I coded up
CPython pure-Python impl,
https://github.com/pfalcon/python-strict-mode. Of course, I faced issues
with CPython, went on to argue with CPython developers that they should
fix their stuff, and then suddenly winter holidays were over.

Fast forward to this November, I figure I'm not making progress. So I
think "god cares about CPython software, I care about *my* software".
I went to convert whatever code I already had running in Pycopy to the
strict mode and found it's not bad at all (fixed a gazillion usability
bugs with the strict mode, yeah).

The spec, I started to write it, because after such a delay, my first
reaction was literally the same as @rosuav's in his reply here on the
list: "Wait, wtf we don't support dynamic module imports, I love
dynamic module imports." So, I had to remind myself why, and write
it down this time. That's when open-source projects get documentation -
when the authors themselves find a need for it ;-).

Bottom line, here's the biggest change I had to apply to my most
mind-boggling dynamic-imports app:
https://github.com/pfalcon/ScratchABlock/commit/ac2a9145ec8c05fe2be7c982d88a8abfa37609cc
The app allows passing a dir name on the command line; the dir can be full
of files, and inside each file there can be multiple module names
to import. Whoa! Still, 25 lines to cover it.

To see whether it's much or not, would need to compare what it would
take me to do that in a static language. So, above I'm using Python as a
kind of DSL for my app. In a static language, I would need to write a
*real* DSL: all the lexer/parser/interpreter business. Not 25 lines at
all. And in Python, I can pay the price of 25 lines to get rid of the most
obnoxious Python misfeature compared to a static language: blatantly
inefficient namespace lookups.


Again, I'd be only more interested to hear/see/tell more stories about
that. Just need to start somewhere.


> and many many others will need tweaking. So they will
> have to be rewritten, or at least tweaked, somehow. Maybe many could
> be rewritten automatically? I do not know. But I think you need to
> consider that you could get to the end of writing this thing and have
> it working perfectly with a major (10x? 50x?) speed improvement, and
> still have trouble getting people interested because you can't run
> code in, like, the enum or pathlib or functools library, or requests
> or numpy or something else. That would be a bummer. How do you see
> that problem getting solved?

I don't see much of a problem at all. I see it the same way as e.g.
Cython or Mypyc authors do: "to use this stuff, you need to change your
Python code".

So, what we need to compare is how much you need to change and what you
get in return. The strict mode asks for rather modest changes compared
to the tools above. But neither does it claim a 10x-50x speed improvement.
Actually, the idea behind the strict mode is not to make Python faster.
It's to make Python 

[Python-ideas] Re: [RFC] "Strict execution mode" (TL;DR version)

2020-12-13 Thread Paul Sokolovsky
Hello,

On Sat, 5 Dec 2020 12:02:52 -0800
Christopher Barker  wrote:

> just one more note:
> 
> > > things like you are proposing with an eye to performance is not
> > > really where the Python community wants to go.  
> >
> > I never met a Python user who said something like "I want Python to
> > be slow" or "I want Python to keep being slow", so we'll see how
> > that goes. 
> 
> But many that might say "I don't want to make Python less flexible in
> order to gain performance"

And I'd shake hands with them, because I add "strict mode" as an
additional optional mode beyond the standard Python's mode. (I'd
however expect that I personally will use it often, because it's just a
small notch above how I write programs in Python anyway.)

> Of course no one one is going to reject an enhancement that improves
> performance if it has no costs.
> 
> My thought on your idea is this:
> 
> Yes, a more restricted (strict) version of Python that had
> substantially better performance could be very nice. But the trick
> here is that you are proposing a spec, hoping that it could be used
> to enhance performance. I suspect you aren't going to get very far
> (with community support) without an implementation that shows what
> the performance benefits really are.

As I mentioned in previous replies, I fully agree that it would be nice
to see performance figures. But sadly, as directly related to the
strict mode, those aren't available yet. However, if the question is to
explicate the idea further, that can be done on synthetic examples
right away.

Suppose we have a pretty typically-looking Python code like (condensed
to save on vertical space):

---
def foo():
    a = 1; b = 2
    for _ in range(10000000):
        c = min(a, b)
foo()
---

The problem with executing that code is that "min" per the standard
Python semantics is looked up by name (and beyond that, the lookup
is two-level, aka "pretty complex"). Done 10 mln times in a loop, that's
gotta be slow.

Let's run it in a Python implementation which doesn't have
existing means to optimize those "pretty complex" lookups, e.g. my
Pycopy (btw, am I the only one who finds it weird that you can't pass
a script to timeit?):

$ pycopy -m timeit -n1 -r1 "import case1"
1 loops, best of 1: 2.41 sec per loop

A common way to optimize global lookups (which are usually by name in
overdynamic languages) is to cache the looked up value in a local
variable (locals aren't part of the external interface, and thus are
usually already optimized to be accessed by "stack slot"):

---
def foo():
    from builtins import min
    a = 1; b = 2
    for _ in range(1000):
        c = min(a, b)
foo()
---

$ pycopy -m timeit -n1 -r1 "import case3"
1 loops, best of 1: 551 msec per loop

4 times faster.
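
For anyone who wants to repeat the comparison on their own interpreter, a
self-contained sketch follows. The function names and the iteration counts
are assumptions chosen for a quick run, not the ones behind the figures
above, and the ratio will differ between implementations (CPython, for
example, has its own caching of global lookups, so the gap there may be
much smaller):

```python
import timeit

def lookup_global(n=100_000):
    a = 1; b = 2
    for _ in range(n):
        c = min(a, b)         # `min` is resolved by name on every iteration
    return c

def lookup_local(n=100_000):
    from builtins import min  # bind once; later accesses use a local slot
    a = 1; b = 2
    for _ in range(n):
        c = min(a, b)
    return c

# timeit.timeit() accepts a callable, which sidesteps the "can't pass a
# script" limitation of the command-line interface.
t_global = timeit.timeit(lookup_global, number=5)
t_local = timeit.timeit(lookup_local, number=5)
print(f"global lookup: {t_global:.3f}s, local lookup: {t_local:.3f}s")
```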


So, the idea behind the strict mode is to be able to perform such an
optimization automatically, without manual patches like "from
builtins import min" above. And the example above only scratches the
surface, for the bytecode interpretation case. But the strict mode
reaches straight to JITted machine code, where it allows generating the
same code for function calls as would be generated for C.

The "code for function calls" is the key phrase here. Of course, Python
differs from C in more things than just name lookups. And most of these
things are necessarily slower (and much harder to optimize). But the
name lookups don't have to be, and the strict mode (so far) tries to
improve just this one aspect. And it does that because it's simple to
do, for very modest losses in Python expressivity (adjusted for
real-world code sanity and maintainability). And it again does that to
put a checkmark against it and move on to the other things to optimize
(or not).

> I'm just one random guy on this list, but my response is:
> 
> "interesting, but show me how it works before you make anything
> official"

It's nothing "official", it's a completely grass-roots proposal for
whoever may be interested in it. But I have to admit that I like it
very much (after converting a few of my apps to it), and already
treat it as an inalienable part of the semantics of my Python dialect,
Pycopy.

> 
> -CHB
> 

[]

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LYBYET6FX3LOEAYYVJBULXWUHIVVJHXN/


[Python-ideas] Re: a new data type: NamedValue -- similar to Enum

2020-12-13 Thread Steven D'Aprano
On Sun, Dec 13, 2020 at 12:34:27AM -0800, Ethan Furman wrote:

[me]
> > >>> class MyValue(int, Enum):
> > ...     ONE = 1
> > ...     TWO = 2
> > ...
> > >>> MyValue.TWO + 3
> > 5
> 
> It certainly can be abused for that, but the intended purpose of IntEnum is 
> not to support math operations, but rather to interoperate with existing 
> APIs that are using ints as magic numbers.

Sure, but "ints as magic numbers" are a bit of a special case. If 
TOPLEFT and BOTTOMRIGHT are magic numbers

(TOPLEFT + BOTTOMRIGHT)//2

is unlikely to make any semantic sense. But I only used int 
because you did :-) A better example, stolen from your earlier one:


class K(frozenset, Enum):
    DIGITS = frozenset("0123456789")
    LETTERS = frozenset(string.ascii_letters)


DIGITS | LETTERS makes perfect semantic sense. Tell me that's abuse, I 
double-dare you :-)
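
A runnable version of that example, as a minimal sketch: the imports and the
variable name `alnum` are additions for illustration, and the members are
reached through the class (`K.DIGITS`) rather than as bare names.

```python
import string
from enum import Enum

class K(frozenset, Enum):
    DIGITS = frozenset("0123456789")
    LETTERS = frozenset(string.ascii_letters)

# Members inherit frozenset behaviour, so set operators work on them directly.
alnum = K.DIGITS | K.LETTERS

print(len(alnum))                          # 62: 10 digits + 52 ASCII letters
print("7" in K.DIGITS, "7" in K.LETTERS)   # True False
```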

The bottom line here is that all of the functionality you suggest makes 
sense, but I'm not convinced that the right API is a new and independent 
data type unrelated to Enum. Everything you suggest sounds to me like a 
variation on Enum:

- a way to add new members to an Enum after creation

- a way to create duplicate Enum values that aren't aliases

- a way for enums to automatically inherit behaviour from their values,
  without explicitly mixing another subclass.

These feel like add-ons to the basic Enum data type, not a full-blown 
independent data type.

Or maybe it's just that I don't like the name NamedValue. (Too vague -- 
`x = 1` is a named value. Maybe if you called it FancyEnum or something 
I'd love it :-)


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/T75R4PG34I6XJUHUB5UAKGTYC5TAHRGW/


[Python-ideas] Re: a new data type: NamedValue -- similar to Enum

2020-12-13 Thread Ethan Furman

On 12/12/20 7:52 PM, Steven D'Aprano wrote:

> On Sat, Dec 12, 2020 at 07:01:55PM -0800, Ethan Furman wrote:
>
>> That's invalid.  Duplicates allowed means:
>>
>> ```
>> class K(NamedValue):
>>     A = 1
>>     B = 1
>> ```
>>
>> B is not an alias for A.  Presumably one has the same number with different
>> meaning.  If that were an Enum:
>>
>> ```
>> >>> K.B
>> K.A
>> ```
>
> Ah! Talking about Boy Looks, I had never noticed that behaviour before.
> (But then I don't regularly use duplicate Enum values.)
>
> What is the reason for that behaviour in Enums?


Different names for the same thing, the second names (and third, and ...) are aliases for the first.  For example, if 
you're me and constantly forget to drop the 's', you might have


class Border(Enum):
    LINE = 'line'
    LINES = 'line'

and then, no matter which I type, I get the right answer.  Silly example, but the point is that Enum members are 
singletons, and there should only be one canonical member to represent a single semantic value -- hence the 
recommendation to compare Enums using `is`.
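
The alias behaviour can be checked directly with the standard `enum` module;
the following is a small sketch built on the `Border` example (the variable
names are only for illustration):

```python
from enum import Enum

class Border(Enum):
    LINE = 'line'
    LINES = 'line'    # same value as LINE, so this becomes an alias

same_object = Border.LINES is Border.LINE
iterated = [member.name for member in Border]   # aliases are skipped
all_names = list(Border.__members__)            # aliases are included
by_value = Border('line').name                  # lookup returns the canonical member

print(same_object)   # True
print(iterated)      # ['LINE']
print(all_names)     # ['LINE', 'LINES']
print(by_value)      # LINE
```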


>> NamedValues, on the other hand, don't have the concept of set membership and
>> one canonical name for a single value.
>>
>> [...]
>>
>> class MyEnum(Enum):
>>     ONE = 1
>>     TWO = 2
>>
>> MyEnum.ONE + 3
>> # TypeError
>>
>> class MyValue(NamedValue):
>>     ONE = 1
>>     TWO = 2
>>
>> MyValue.TWO + 3
>> 5
>
> Isn't this the solution to that?
>
> >>> class MyValue(int, Enum):
> ...     ONE = 1
> ...     TWO = 2
> ...
> >>> MyValue.TWO + 3
> 5


It certainly can be abused for that, but the intended purpose of IntEnum is not to support math operations, but rather 
to interoperate with existing APIs that are using ints as magic numbers.
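
For reference, the difference between a plain Enum and an int-mixed Enum can
be sketched with the standard library alone. NamedValue is only a proposal in
this thread, so it is not shown; the variable names and the string-indexing
line are assumptions added to illustrate the "interoperate with int APIs"
point.

```python
from enum import Enum

class MyEnum(Enum):
    ONE = 1
    TWO = 2

class MyValue(int, Enum):
    ONE = 1
    TWO = 2

# A plain Enum member does not support arithmetic.
try:
    MyEnum.ONE + 3
except TypeError:
    plain_enum_adds = False
else:
    plain_enum_adds = True

# An int-mixed member behaves like its value; the result is a plain int.
total = MyValue.TWO + 3

# Typical interop use: passing the member wherever an int is expected.
second_char = "abc"[MyValue.ONE]

print(plain_enum_adds, total, type(total) is int, second_char)
# False 5 True b
```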


--
~Ethan~
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2BTEWH4RFOWIWY7QICGH2ZQSFNG5JPTM/