[Python-Dev] Re: [Python-ideas] Re: reversed enumerate

2020-04-01 Thread Andrew Barnert via Python-Dev
Before jumping in:

In many cases, when you want to reverse an enumerate, the input is small and 
fixed-size, so there’s a trivial way to do this: just store the enumerate 
iterator in a tuple, and tuples are reversible.

   for idx, value in reversed(tuple(enumerate(stuff))):

But of course there are some cases where this isn’t appropriate, like 
enumerating a fixed-size but huge input.
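For those cases, here's a minimal sketch of a lazy alternative (the helper 
name is mine; it assumes the input is both Sized and Reversible, like a huge 
list):

    def reversed_enumerate(seq):
        # pair reversed indices with reversed elements, without
        # materializing a tuple; assumes seq is Sized and Reversible
        return zip(range(len(seq) - 1, -1, -1), reversed(seq))

    # list(reversed_enumerate(['a', 'b'])) == [(1, 'b'), (0, 'a')]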

> On Apr 1, 2020, at 19:23, Steven D'Aprano  wrote:
> 
> [Ilya]
>> I needed reversed(enumerate(x: list)) in my code, and have discovered
>> that it wouldn't work. This is disappointing because the operation is well
>> defined.
> 
> It isn't really well-defined, since enumerate can operate on infinite 
> iterators, and you cannot reverse an infinite stream.

...

> However, having said that, I think that your idea is not unreasonable. 
> `enumerate(it)` in the most general case isn't reversible, but if `it` 
> is reversible and sized, there's no reason why `enumerate(it)` shouldn't 
> be too.

Agreed—but this is just a small piece of a much wider issue. Today, enumerate 
is always an Iterator. It’s never reversible. But it’s also not sized, or 
subscriptable, or in-testable, even if you give it inputs that are. And it’s 
not just enumerate—the same is true for map, filter, zip, itertools.islice, 
itertools.dropwhile, etc.

There’s no reason these things couldn’t all be views, just like the existing 
dict views (and other things like memoryview and third-party things like numpy 
array slices). In fact, they already are in Swift, and will be in C++20. 

> My personal opinion is that this is a fairly obvious and straightforward 
> enhancement, one which (hopefully!) shouldn't require much, if any, 
> debate. I don't think we need a new class for this, I think enhancing 
> enumerate to be reversible if its underlying iterator is reversible 
> makes good sense.

Actually, that doesn’t work—it has to be Sized as well. 

More generally, it’s rarely _quite_ as simple as just “views support the same 
operations as the things they view”. An enumerate can be a Sequence if its 
input is, but a filter can’t. A map with multiple inputs isn’t Reversible 
unless they’re all not just Reversible but Sized, although a map with only one 
input doesn’t need it to be Sized. And so on. But none of these things are 
hard, it’s just a bunch of work to go through all the input types for all the 
view types and write up the rules. (Or steal them from another language or 
library that already did that work…)
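As a rough sketch of what one such view might look like (mine, not any 
existing proposal's API): a view-style enumerate that is iterable always, and 
reversible exactly when its input is Sized and Reversible.

    from collections.abc import Reversible, Sized

    class enumerate_view:
        # minimal sketch, not a worked-out design
        def __init__(self, it, start=0):
            self._it, self._start = it, start
        def __iter__(self):
            return iter(enumerate(self._it, self._start))
        def __reversed__(self):
            # reversing needs the input to be both Sized and Reversible
            if not (isinstance(self._it, Sized) and isinstance(self._it, Reversible)):
                raise TypeError("can only reverse a sized, reversible input")
            last = self._start + len(self._it) - 1
            return zip(range(last, self._start - 1, -1), reversed(self._it))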

> But if you can show some concrete use-cases, especially one or two from 
> the standard library, that would help your case. Or some other languages 
> which offer this functionality as standard.

Agreed. I don’t think we need to wait until someone designs and writes a 
complete viewtools library and submits it for stdlib inclusion before we can 
consider adding just one extension to one iterator. But I do think we want to 
add the one(s) that are most useful if any, not just whichever ones people 
think of first. I’ve personally wanted to reverse a map or a filter more often 
than an enumerate, but examples would easily convince me that that’s just me, 
and reversing enumerate is more needed.

> One potentially serious question: what should `enumerate.__reversed__` 
> do when given a starting value?
> 
>   reversed(enumerate('abc', 1))

I don’t think this is a problem. When you reversed(tuple(enumerate('abc', 1))) 
today, what do you get? You presumably don’t even need to look that up or try 
it out. It would be pretty confusing if it were different without the tuple.
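Concretely:

    >>> list(reversed(tuple(enumerate('abc', 1))))
    [(3, 'c'), (2, 'b'), (1, 'a')]

A reversed(enumerate('abc', 1)) would presumably have to produce the same 
pairs.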



[Python-Dev] Re: PEP 584: Add Union Operators To dict

2020-02-16 Thread Andrew Barnert via Python-Dev
Brandt Bucher wrote:

> This leads me to believe that we’re approaching the problem wrong. Rather 
> than making a
> copy and working on it, I think the problem would be better served by a 
> protocol that runs
> the default implementation, then calls some dunder hook on the subclass to 
> build a
> new instance.
>
> Let’s call this method `__build__`. I’m not sure what its arguments would
> look like, but it would probably need at least `self`, and an instance of the
> built-in base class (in this case a `float`), and return a new instance of the
> subclass based on the two. It would likely also need to work with `cls` 
> instead
> of `self` for `classmethod` constructors like
> `dict.fromkeys`, or have a second hook for that case.

You can call `self.fromkeys`, and it works just like calling 
`type(self).fromkeys`.
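A quick illustration:

    class MyDict(dict):
        pass

    d = MyDict()
    # classmethod alternate constructors are reachable through the instance,
    # and dict.fromkeys already constructs an instance of the subclass
    assert type(d.fromkeys("ab")) is MyDict is type(type(d).fromkeys("ab"))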

The only real advantage of having a second hook is that it would simplify the 
most trivial cases—which are very common. In particular, probably 90% of 
subclasses of builtins are like Steven's `MyFloat` example—all you really want 
to do is call your constructor in place of the super's constructor, and if you 
have to call it with the result of your super's constructor instead, that's 
fine because `MyFloat(x)` on a `float` or `MyFloat` is equivalent to `x` 
anyway. So you could just write `__build_cls__ = __new__` and you're done. With 
only an instance-method version, you'd have to write `def __build__(self, 
other): return type(self)(other)`. Which isn't _terrible_ or anything, but as 
boilerplate that has to be added (probably without being understood) to 
hundreds of classes, it's not exactly ideal.
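Spelled out with the hypothetical hook names from this thread (nothing here 
exists today; it's just a sketch of the shape of the protocol):

    class MyFloat(float):
        # instance-method form: a post-hook that converts the plain float
        # produced by the default implementation into a MyFloat
        def __build__(self, result):
            return type(self)(result)

    # With the classmethod-style hook, the trivial case above would collapse
    # to a one-liner along the lines of
    #     __build_cls__ = float.__new__
    # i.e. "my constructor is fine, just call it on the super's result".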

If there were a way to actually get your constructor called on the `__new__` 
arguments directly, without constructing the superclass instance first, that 
would be even better. Besides being more efficient (and that "more efficient" 
could actually be a big deal, because we're talking about every call to every 
operator dunder and many other methods on builtins needing to check this in 
addition to whatever else it does…), it would allow a trivial implementation on 
types that share their super's constructor signature but can't guarantee that 
`MyType(x) == x`. Even for cases like `defaultdict`, if you could supply a 
constructor, you'd be fine: `partial(type(self), self.default_factory)` can be 
used with the arguments to a `dict` construction call just as easily as it can 
be used with a `dict` itself.
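That part already works with today's types; a quick sketch (the name `ctor` is 
mine):

    from collections import defaultdict
    from functools import partial

    d = defaultdict(list, {'a': [1]})
    ctor = partial(type(d), d.default_factory)
    # works on a finished dict...
    assert ctor(dict(d)) == d
    # ...or on the same arguments a dict construction call would take
    d2 = ctor([('a', [1])])
    assert d2 == d and d2.default_factory is list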

But I'm not sure there is such a way. (Maybe the pickle/copy protocol can help 
here? Not sure without thinking it through more…)

> If implemented right, a system like the one described above
> (__build__) wouldn’t be backward-incompatible, as long as nobody 
> was already using the name.

Assuming the builtins don't grow `__build__` methods that use `cls` or 
`type(self)` (which is what you'd ideally want, but then you get the same 
massive backward-incompatibility problem we were trying to avoid…), it seems 
like we're adding possibly significant cost to everything (maybe not 
significant for `dict.__union__`, but maybe so for `int.__add__`) for a benefit 
that almost no code actually uses. Maybe the long-term benefit of everyone being 
able to drop those `MyFloat(…)` calls all over once they can require 3.10+ is 
worth the immediate and permanent cost to performance and implementation 
complexity, but I'm not sure.

(If there were an opt-in way to replace the super's construction call instead 
of post-hooking it, the cost might be reduced enough to change that 
calculation. But again, I'm not sure if there is such a way.)


[Python-Dev] Re: pickle.reduce and deconstruct functions

2020-02-09 Thread Andrew Barnert via Python-Dev
Some additional things that might be worth doing. I believe this exposes enough 
to allow people to build an object graph walker out of the `pickle`/`copy` 
protocol without having to access fragile internals, and without restricting 
future evolution of the internals of the protocol. See [the same -ideas 
thread][1] again for details on how it could be used and why.

These would all be changes to the `copy` module, together with the changes to 
`pickle` and `copyreg` in the previous message:

class Memo(dict):
    """Memo is a mapping that can be used to do memoization exactly the
    same way deepcopy does, so long as you only use ids as keys and only
    use these operations:

        y = memo.get(id(x), default)
        memo[id(x)] = y
        memo.keep_alive(x)
    """
    def keep_alive(self, x):
        # deepcopy keys its keep-alive list under the memo's own id
        self.setdefault(id(self), []).append(x)

def reconstruct(x, memo: Memo, reduction, *, recurse=deepcopy):
    """reconstruct(x, memo, reduction, recurse=recursive_walker)
    Constructs a new object from the reduction by calling recursive_walker
    on each value. The reduction should have been obtained as pickle.reduce(x)
    and the memo should be a Memo instance (which will be passed to each
    recursive_walker call).
    """
    return _reconstruct(x, memo, *reduction, deepcopy=recurse)

def copier(cls):
    """copier(cls) -> func
    Returns a function func(x, memo, recurse) that can be used to copy
    objects of type cls without reducing and reconstructing them, or None
    if there is no such function.
    """
    if c := _deepcopy_dispatch.get(cls):
        return c
    if issubclass(cls, type):
        return _deepcopy_atomic

Also, all of the private functions that are stored in `_deepcopy_dispatch` 
would rename their `deepcopy` parameter to `recurse`, and the two that don't 
have such a parameter would add it.
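Put together, a deepcopy-style object-graph walker built only on these public 
pieces might look something like this (my sketch, not part of the proposal; it 
assumes the pickle.reduce function from the previous message):

    _MISSING = object()

    def walk(x, memo=None):
        if memo is None:
            memo = Memo()
        y = memo.get(id(x), _MISSING)
        if y is not _MISSING:
            return y
        if c := copier(type(x)):
            y = c(x, memo, walk)
        else:
            y = reconstruct(x, memo, pickle.reduce(x), recurse=walk)
        memo[id(x)] = y
        memo.keep_alive(x)
        return y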


  [1]: 
https://mail.python.org/archives/list/python-id...@python.org/thread/RTZGM7L7JOTKQQICN6XDSLOAMU4A62CA/


[Python-Dev] pickle.reduce and deconstruct functions

2020-02-08 Thread Andrew Barnert via Python-Dev
This was [posted on -ideas][1], but apparently many people didn't see it 
because of the GMane migration going on at exactly the same time. At any rate, 
Antoine Pitrou suggested it should be discussed on -dev instead. And this gives 
me a chance to edit it (apparently it was markdown-ish enough to confuse 
Hyperkitty and turn it into a mess).

Pickling uses an extensible protocol that lets any class determine how its 
instances can be deconstructed and reconstructed. Both `pickle` and `copy` use 
this protocol, but it could be useful more generally. Unfortunately, to use it 
more generally requires relying on undocumented details. I think we should 
expose a couple of helpers to fix that: 

    # Return the same (shallow) reduction tuple that pickle.py, copy.py,
    # and _pickle.c would use
    pickle.reduce(obj) -> (callable, args[, state[, litems[, ditems[, statefunc]]]])

    # Return a callable and arguments to construct a (shallow) equivalent object
    # Raise a TypeError when that isn't possible
    pickle.deconstruct(obj) -> callable, args, kw

So, why do you want these?

There are many cases where you want to "deconstruct" an object if possible. For 
example:

 * Pattern matching depends on being able to deconstruct objects like this.
 * Auto-generating a `__repr__`, as suggested in [Chris Angelico's -ideas 
   thread][2].
 * Quick-and-dirty REPL stuff, and deeper reflection stuff using 
   `inspect.signature` and friends.

Of course not every type tells `pickle` what to do in an appropriate way that 
we can use, but a pretty broad range of types do, including (I think; I haven't 
double-checked all of them) `@dataclass`, `namedtuple`, `@attr.s`, many builtin 
and extension types, almost all reasonable types that use `copyreg`, and any 
class that pickles via the simplest customization hook `__getnewargs[_ex]__`. 
That's more than enough to be useful. 
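For example, with the proposed helper, a hypothetical session like this should 
work for a namedtuple (deconstruct doesn't exist yet, so this is purely 
illustrative):

    from collections import namedtuple

    Point = namedtuple('Point', 'x y')
    func, args, kw = pickle.deconstruct(Point(1, 2))
    assert func(*args, **kw) == Point(1, 2)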

And, just as important, it won't (except in intentionally pathological cases) 
give us a false positive, where a type is correctly pickleable and we think we 
can deconstruct it but the deconstruction is wrong. (For some uses, you are 
going to want to fall back to heuristics that are often right but sometimes 
misleadingly wrong, but I don't think the `pickle` module should offer anything 
like that. Maybe `inspect` should, but I'm not proposing that here.)

The way to get the necessary information isn't fully documented, and neither is 
the way to interpret it. And I don't think it _should_ be documented, because 
it changes every so often, and for good reasons; we don't want anyone writing 
third-party code that relies on those details. Plus, a different Python 
implementation might conceivably do it differently. Public helpers exposed from 
`pickle` itself won't have those problems. 

Here's a first take at the code.

def reduce(obj, proto=pickle.DEFAULT_PROTOCOL):
    """reduce(obj) -> (callable, args[, state[, litems[, ditems[, statefunc]]]])
    Return the same reduction tuple that the pickle and copy modules use
    """
    cls = type(obj)
    if reductor := copyreg.dispatch_table.get(cls):
        # note that this is not a special method call (not looked up on the type)
        return reductor(obj)
    if reductor := getattr(obj, "__reduce_ex__", None):
        return reductor(proto)
    if reductor := getattr(obj, "__reduce__", None):
        return reductor()
    raise TypeError(f"{cls.__name__} objects are not reducible")

def deconstruct(obj):
    """deconstruct(obj) -> callable, args, kwargs
    callable(*args, **kwargs) will construct an equivalent object
    """
    reduction = reduce(obj)

    # If any of the optional members are included, pickle/copy has to
    # modify the object after construction, so there is no useful single
    # call we can deconstruct to.
    if any(reduction[2:]):
        raise TypeError(f"{type(obj).__name__} objects are not deconstructable")

    func, args, *_ = reduction

    # Many types (including @dataclass, namedtuple, and many builtins)
    # use copyreg.__newobj__ as the constructor func. The args tuple is
    # the type (or, when appropriate, some other registered
    # constructor) followed by the actual args. However, any function
    # with the same name will be treated the same way (because under the
    # covers, this is optimized to a special opcode).
    if func.__name__ == "__newobj__":
        return args[0], args[1:], {}

    # Types that implement __getnewargs_ex__ (mainly, only those) use
    # copyreg.__newobj_ex__ as the constructor func. The args tuple
    # holds the type, *args tuple, and **kwargs dict. Again, this is
    # special-cased by name.
    if func.__name__ == "__newobj_ex__":
        return args

    # If any other special copyreg functions are added in the future,
    # this code won't know about them.

Re: [Python-Dev] bitfields - short - and xlc compiler

2016-03-20 Thread Andrew Barnert via Python-Dev
On Mar 20, 2016, at 09:07, Michael Felt  wrote:
> 
>> On 2016-03-18 05:57, Andrew Barnert via Python-Dev wrote:
>> Yeah, C99 (6.7.2.1) allows "a qualified or unqualified version of _Bool, 
>> signed int, unsigned int, or some other implementation-defined type", and 
>> same for C11. This means that a compiler could easily allow an 
>> implementation-defined type that's identical to and interconvertible with 
>> short, say "i16", to be used in bitfields, but not short itself.
>> 
>> And yet, gcc still allows short "even in strictly conforming mode" (4.9), 
>> and it looks like Clang and Intel do the same.
>> 
>> Meanwhile, MSVC specifically says it's illegal ("The type-specifier for the 
>> declarator must be unsigned int, signed int, or int") but then defines the 
>> semantics (you can't have a 17-bit short, bit fields act as the underlying 
>> type when accessed, alignment is forced to a boundary appropriate for the 
>> underlying type). They do mention that allowing char and long types is a 
>> Microsoft extension, but still nothing about short, even though it's used in 
>> most of the examples on the page.
>> 
>> Anyway, is the question what ctypes should do? If a platform's compiler 
>> allows "short M: 1", especially if it has potentially different alignment 
>> than "int M: 1", ctypes on that platform had better make ("M", c_short, 1) 
>> match the former, right?
>> 
>> So it sounds like you need some configure switch to test that your compiler 
>> doesn't allow short bit fields, so your ctypes build at least skips that 
>> part of _ctypes_test.c and test_bitfields.py, and maybe even doesn't allow 
>> them in Python code.
>> 
>> 
>>> test_short fails on AIX when using xlC in any case. How terrible is this?

> a) this does not look solvable using xlC, and I expect from the comment 
> above re: MSVC, that it will, or should, also fail there.

> And, imho, if anything is to be done, it is a decision to be made by "Python".

Sure, but isn't that exactly why you're posting to this list?

> b) aka - it sounds like a defect, at least in the test.

Agreed. But I think the test is reasonable on at least MSVC, gcc, clang, and 
icc. So what you need is some way to run the test on those compilers, but not 
on compilers that can't handle it.

So it sounds like you need a flag coming from autoconf that can be tested in C 
(and probably in Python as well) that tells you whether the compiler can handle 
it. And I don't think there is any such flag. 

Which means someone would have to add the configure test. And if people who use 
MSVC, gcc, and clang are all unaffected, I'm guessing that someone would have 
to be someone who cares about xlC or some other compiler, like you.

The alternative would be to just change the docs to make it explicit that using 
non-int bitfields isn't supported but may work in platform-specific ways. If 
you got everyone to agree to that, surely you could just remove the tests, 
right? But if people are actually writing C code that follows the examples on 
the MSVC bitfield docs page, and need to talk to that code from ctypes, I don't 
know if it would be acceptable to stop officially supporting that.



Re: [Python-Dev] PEP 484: updates to Python 2.7 signature syntax

2016-03-19 Thread Andrew Barnert via Python-Dev
On Mar 19, 2016, at 18:18, Guido van Rossum  wrote:
> 
> Second, https://github.com/python/typing/issues/186. This builds on
> the previous syntax but deals with the other annoyance of long
> argument lists, this time in case you *do* care about the types. The
> proposal is to allow writing the arguments one per line with a type
> comment on each line. This has been implemented in PyCharm but not yet
> in mypy. Example:
> 
>def gcd(
>a,  # type: int
>b,  # type: int
>):
># type: (...) -> int
>

This is a lot nicer than what you were originally discussing (at #1101? I 
forget...). Even more so given how trivial it will be to mechanically convert 
these to annotations if/when you switch an app to pure Python 3.

But one thing: in the PEP and the docs, I think it would be better to pick an 
example with longer parameter names. This example shows that even in the worst 
case it isn't that bad, but a better example would show that in the typical 
case it's actually pretty nice. (Also, I don't see why you wouldn't just use 
the "old" comment form for this example, since it all fits on one line and 
isn't at all confusing.)



Re: [Python-Dev] bitfields - short - and xlc compiler

2016-03-19 Thread Andrew Barnert via Python-Dev
On Mar 17, 2016, at 18:35, MRAB  wrote:
> 
>> On 2016-03-18 00:56, Michael Felt wrote:
>> Update:
>> Is this going to be impossible?
> From what I've been able to find out, the C89 standard limits bitfields to 
> int, signed int and unsigned int, and the C99 standard added _Bool, although 
> some compilers allow other integer types too. It looks like your compiler 
> doesn't allow those additional types.

Yeah, C99 (6.7.2.1) allows "a qualified or unqualified version of _Bool, signed 
int, unsigned int, or some other implementation-defined type", and same for 
C11. This means that a compiler could easily allow an implementation-defined 
type that's identical to and interconvertible with short, say "i16", to be used 
in bitfields, but not short itself.

And yet, gcc still allows short "even in strictly conforming mode" (4.9), and 
it looks like Clang and Intel do the same. 

Meanwhile, MSVC specifically says it's illegal ("The type-specifier for the 
declarator must be unsigned int, signed int, or int") but then defines the 
semantics (you can't have a 17-bit short, bit fields act as the underlying type 
when accessed, alignment is forced to a boundary appropriate for the underlying 
type). They do mention that allowing char and long types is a Microsoft 
extension, but still nothing about short, even though it's used in most of the 
examples on the page.

Anyway, is the question what ctypes should do? If a platform's compiler allows 
"short M: 1", especially if it has potentially different alignment than "int M: 
1", ctypes on that platform had better make ("M", c_short, 1) match the former, 
right?

So it sounds like you need some configure switch to test that your compiler 
doesn't allow short bit fields, so your ctypes build at least skips that part 
of _ctypes_test.c and test_bitfields.py, and maybe even doesn't allow them in 
Python code.
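For reference, the Python side of that test declares short bit fields along 
these lines (a sketch; the real struct in test_bitfields.py has more fields):

    import ctypes

    class BITS(ctypes.Structure):
        # ctypes itself accepts short bit fields; whether this layout matches
        # what the C compiler produces is exactly what _ctypes_test.c checks
        _fields_ = [("M", ctypes.c_short, 1),
                    ("N", ctypes.c_short, 2),
                    ("O", ctypes.c_short, 3)]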


>> test_short fails on AIX when using xlC in any case. How terrible is this?
>> 
>> ==
>> FAIL: test_shorts (ctypes.test.test_bitfields.C_Test)
>> --
>> Traceback (most recent call last):
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py",
>> line 48, in test_shorts
>>  self.assertEqual((name, i, getattr(b, name)), (name, i,
>> func(byref(b), name)))
>> AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)
>> 
>> First differing element 2:
>> -1
>> 1
>> 
>> - ('M', 1, -1)
>> ?  -
>> 
>> + ('M', 1, 1)
>> 
>> --
>> Ran 440 tests in 1.538s
>> 
>> FAILED (failures=1, skipped=91)
>> Traceback (most recent call last):
>>File "./Lib/test/test_ctypes.py", line 15, in 
>>  test_main()
>>File "./Lib/test/test_ctypes.py", line 12, in test_main
>>  run_unittest(unittest.TestSuite(suites))
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py",
>> line 1428, in run_unittest
>>  _run_suite(suite)
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py",
>> line 1411, in _run_suite
>>  raise TestFailed(err)
>> test.test_support.TestFailed: Traceback (most recent call last):
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py",
>> line 48, in test_shorts
>>  self.assertEqual((name, i, getattr(b, name)), (name, i,
>> func(byref(b), name)))
>> AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)
>> 
>> First differing element 2:
>> -1
>> 1
>> 
>> - ('M', 1, -1)
>> ?  -
>> 
>> + ('M', 1, 1)
>> 
>> 
>> 
>> 
>>> On 17-Mar-16 23:31, Michael Felt wrote:
>>> a) hope this is not something you expect to be on -list, if so - my
>>> apologies!
>>> 
>>> Getting this message (here using c99 as compiler name, but same issue
>>> with xlc as compiler name)
>>> c99 -qarch=pwr4 -qbitfields=signed -DNDEBUG -O -I. -IInclude
>>> -I./Include -I/data/prj/aixtools/python/python-2.7.11.2/Include
>>> -I/data/prj/aixtools/python/python-2.7.11.2 -c
>>> /data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c
>>> -o
>>> build/temp.aix-5.3-2.7/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.o
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field M must be of type signed int,
>>> unsigned int or int.
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field N must be of type signed int,
>>> unsigned int or int.
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field O must be of type signed int,
>>> unsigned int or int.
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field P must be of type signed int,
>>> unsigned int or int.
>>> "/dat

Re: [Python-Dev] Python should be easily compilable on Windows with MinGW

2016-02-26 Thread Andrew Barnert via Python-Dev
One alternative to consider is using Cygwin. A complete Cygwin environment, 
including a GCC toolchain, is pretty small. And it can build a *nix-style 
CPython that works inside the Cygwin environment. That may not be sufficient 
for a lot of uses, but for your purpose, it should be.

Another alternative, as crazy as it may sound, is to get an AWS-Free-Tier EC2 
instance and develop on that.

Or, of course, buy an ancient used laptop and install linux natively.

Obviously none of these are ideal, but they may still be better for you than 
waiting for a complete MinGW port of Python or a smaller MSVC toolchain.

> On Feb 26, 2016, at 02:12, Mathieu Dupuy  wrote:
> 
> Hi.
> I am currently working on adding some functionality on a standard
> library module (http://bugs.python.org/issue15873). The Python part
> went fine, but now I have to do the C counterpart, and I have run into
> several problems which, stacked up, are a huge obstacle to easily
> contributing further. Currently, despite all I could do, I can't go
> further on my patch.
> 
> I am currently working with very limited network, CPU and time
> resources*, which are quite uncommon in the western world, but much
> less so in the rest of the world. I have a 2GB/month mobile data
> plan and a 100KB/s speed. For the C part of my patch, I should
> download Visual Studio. The Express Edition 2015 is roughly 9GB. I
> can't afford that.
> 
> I downloaded Virtualbox and two Linux netinstall (Ubuntu 15.10 and
> Fedora 23). Shortly, I couldn't get something working quickly and
> simply (quickly = less than 2 hours, downloading time NOT included,
> which is anyway way too already much). What went wrong and why it went
> wrong could be a whole new thread and is outside of the scope of this
> message.
> Let me be precise: at my work I use many virtualbox instances
> automatically fired and run in parallel to test new deployments and
> run unittests. I like this tool,
> but despite its simple look, it (most of the time) cannot be used
> simply by a novice. The concepts it requires you to understand are
> not intuitive at first sight and there is *always* a thing that go
> wrong (guest additions, mostly).(for example : Ubuntu and Virtualbox
> shipped for a moment a broken version of mount.vboxsf, preventing
> sharing folder to mount. Despite it's fixed, the broken releases
> spread everywhere and you may encounter them a lot in various Ubuntu
> and Virtualbox version. I downloaded the last versions of both and I
> am yet infected. https://www.virtualbox.org/ticket/12879). I could do
> whole new thread on why you can't ask newcomers to use Virtualbox
> (currently, at least).
> 
> I ran into is a whole patch set to make CPython compile on MinGW
> (https://bugs.python.org/issue3871#msg199695). But it is not denying
> it's very experimental, and I know I would again spent useless hours
> trying to get it work rather than joyfully improving Python, and
> that's exactly what I do not want to happen.
> 
> Getting ready to contribute to CPython pure Python modules from a
> standard, average mr-everyone Windows PC for a beginner-to-medium
> contributor only requires a few megabytes of internet and a few minutes of his
> time: getting a tarball of CPython sources (or cloning the github CPython
> mirror)**, a basic text editor and msys-git. The step further, if doing
> some -even basic- C code is required, implies downloading 9GB of Visual
> Studio and countless hours for it to be ready to use.
> I think downloading the whole Visual Studio suite is a huge stopper to
> contribute further for an average medium-or-below-contributor.
> 
> I think (and I must not be the only one since CPython is to be moved
> to github), that barriers to contribute to CPython should be set to
> the lowest.
> Of course my situation is a bit special but I think it represents
> daily struggle of a *lot* of non-western programmer (at least for
> limited internet)(even here in Australia, landline limited internet
> connections are very common).
> It's not a big deal if the resulting MinGW build is twenty times slower or
> if some of the most advanced modules can't be built. But every
> programmer should be able to easily make some C hacks and get them to
> work.
> 
> Hoping you'll be receptive to my pleas,
> Cheers
> 
> 
> * I am currently picking fruits in the regional Australia. I live in a van
> and have internet through with smartphone through an EDGE connection. I can
> plug the laptop in the farm but not in the van.
> ** No fresh programmer uses Mercurial unless he has a gun pointed at his
> head.

Re: [Python-Dev] Buffer overflow bug in GNU C's getaddrinfo()

2016-02-17 Thread Andrew Barnert via Python-Dev
On Feb 17, 2016, at 10:44, MRAB  wrote:
> 
> Is this something that we need to worry about?
> 
> Extremely severe bug leaves dizzying number of software and devices vulnerable
> http://arstechnica.com/security/2016/02/extremely-severe-bug-leaves-dizzying-number-of-apps-and-devices-vulnerable/

Is there a workaround that Python and/or Python apps should be doing, or is 
this just a matter of everyone on glibc 2.9+ needs to update their glibc?



Re: [Python-Dev] Wordcode v2

2016-02-14 Thread Andrew Barnert via Python-Dev
On Feb 14, 2016, at 19:05, Guido van Rossum  wrote:
> 
> I think it's probably too soon to discuss on python-dev, but I do
> think that something like this could be attempted in 3.6 or (more
> likely) 3.7, if it really is faster.
> 
> An unfortunate issue however is that many projects seem to make a
> hobby of hacking bytecode.
> All those projects would have to be totally
> rewritten in order to support the new wordcode format (as opposed to
> just having to be slightly adjusted to support the occasional new
> bytecode opcode).

This is part of why I suggested, on -ideas, that we should add a 
mutating/assembling API to the dis module. People argued that such an API would 
make the bytecode format more fragile, but the exact opposite is true.

At the dis level, everything is unchanged by wordcode. Or by Serhiy's 
args-packed-in-opcode. So, if the dis module could do everything for people 
that, say, the third-party byteplay module does (which wouldn't take much), so 
that things like coverage.py, or the various special-case optimizer decorators on 
PyPI and ActiveState, etc. could all be written to deal with the dis module 
format rather than raw bytecode, we could make changes like this without 
risking nearly as much breakage.
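The analysis half of that already exists today; for example, this runs on the 
current dis module and would be unchanged by a switch to wordcode:

    import dis

    def f(x):
        return x + 1

    # a stable, version-independent view of the bytecode; what's missing
    # is the mutating/assembling half of the API
    for ins in dis.get_instructions(f):
        print(ins.offset, ins.opname, ins.argval)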

Anyway, this obviously wouldn't help the transition for 3.6. But improving dis 
in 3.6, with a warning that raw bytecode might start changing more frequently 
and/or radically in the future now that there's less reason to depend on it, 
might help if wordcode were to go into 3.7.

> All of which means that it's more likely to make it into 3.7. See you
> on python-ideas!
> 
> --Guido
> 
>> On Sun, Feb 14, 2016 at 4:20 PM, Demur Rumed  wrote:
>> Saw recent discussion:
>> https://mail.python.org/pipermail/python-dev/2016-February/143013.html
>> 
>> I remember trying WPython; it was fast. Unfortunately it feels it came at
>> the wrong time when development was invested in getting py3k out the door.
>> It also had a lot of other ideas like *_INT instructions which allowed
>> having oparg to be a constant int rather than needing to LOAD_CONST one.
>> Anyways I'll stop reminiscing

Despite the name (and inspiration), my fork has very little to do with WPython. 
I'm just focused on simpler (hopefully = faster) fetch code; he started with 
that, but ended up going the exact opposite direction, accepting more 
complicated (and much slower) fetch code as a reasonable cost for drastically 
reducing the number of instructions. (If you double the 30% fetch-and-parse 
overhead per instruction, but cut the number of instructions to 40%, the net is 
a huge win.)
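(Concretely, under those assumed numbers: if an instruction costs 1 unit, 0.3 
of it fetch-and-parse, doubling the fetch cost makes each instruction 1.3, and 
running only 40% as many instructions gives 1.3 × 0.4 ≈ 0.52 of the original 
total.)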



>> 
>> abarnert has started an experiment with wordcode:
>> https://github.com/abarnert/cpython/blob/c095a32f2a68ac708466b9c64906cc4d0f5de1ee/Python/wordcode.md
>> 
>> I've personally benchmarked this fork with positive results. This experiment
>> seeks to be conservative-- it doesn't seek to introduce new opcodes or
>> combine BINARY_OP's all into a single op where the currently
>> unused-in-wordcode arg then states the kind of binary op (à la COMPARE_OP).
>> I've submitted a pull request which is working on fixing tests & updating
>> peephole.c
>> 
>> Bringing this up on the list to figure out if there's interest in a basic
>> wordcode change. It feels like there's no downsides: faster code, smaller
>> bytecode, simpler interpretation of bytecode (The Nth instruction starts at
>> the 2Nth byte if you count EXTENDED_ARG as an instruction). The only
>> downside is the transitional cost
>> 
>> What'd be necessary for this to be pulled upstream?
>> 
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-12 Thread Andrew Barnert via Python-Dev
On Feb 12, 2016, at 12:58, Glenn Linderman  wrote:
> 
>> On 2/12/2016 12:06 PM, Chris Barker wrote:
>> As for the SS# example -- it seems a bad idea to me to store a SS# number as 
>> an integer anyway -- so all the weird IDs etc. formats aren't really 
>> relevant...
> 
> SS#... why not integer?  Phone#... why not integer? There's a lot of nice 
> digit-division conventions for phone#s in different parts of the world.

I'm the one who brought up the SSN example--and, as I said at the time, I 
almost certainly wouldn't have done that in Python. I was maintaining tests for 
a service that stored SSNs as integers (which I think is a mistake, but I 
couldn't change it), a automatically-generated strongly-typed interface to that 
service (which is good), and no easy way to wrap or hook that interface (which 
is bad). In Python, it's hard to imagine how I'd end up with a situation where 
I couldn't wrap or hook the interface and treat SSNs as strings in my test 
code. (In fact, for complicated tests, I did exactly that in Python to make 
sure they were correct, then ported them over to integrate with the test 
suite...)

And anyway, the only point was that I've actually used a grouping that isn't 
"every 3 digits" and it didn't end the world. I think everyone agrees that some 
such groupings will come up--even if not every specific example is good, there 
are some that are. Even the people who want something more conservative than 
the PEP don't seem to be taking that position--they may not want double 
underscores, or "123_456_j", but they're fine with "if yuan > 1_0000_0000:".

So, either we try to anticipate every possible way people might want to group 
numbers and decide which ones are good or bad, or we just let the style guide 
say "meaningful group of digits" and let each developer decide what counts as 
"meaningful" for their application. Does anyone really want to argue for the 
former? 

If not, why not just settle that and go back to bikeshedding the cases that 
*are* contended, like "123_456_j"? (I'm happy either way, as long as the 
grammar rule is dead simple and the PEP 8 rule is pretty simple, but I know 
others have strong, and conflicting, opinions on that.)


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Thursday, February 11, 2016 8:10 PM, Glenn Linderman  
wrote:

> On 2/11/2016 7:56 PM, David Mertz wrote:
>> Great PEP overall. We definitely don't want the restriction to grouping
>> numbers only in threes. South Asian crore use grouping in twos.
>> https://en.m.wikipedia.org/wiki/Crore
>
> Interesting... 3 digits in the least significant group, and _then_ by
> twos. Wouldn't have predicted that one! Never bumped into that
> notation before!


The first time I used underscore separators in any language, it was a test 
script for a server that wanted social security numbers as integers instead of 
strings, like 123_45_6789.[^1] 

Which is why I suggested the style guideline should just say "meaningful 
grouping of digits", rather than try to predict what counts as "meaningful" for 
every program.


[^1] Of course in Python, it's usually trivial to stick a shim in between the 
database and the model thingy so I could just pass in "123-45-6789", so I don't 
expect to ever need this specific example.


Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Andrew Barnert via Python-Dev
On Thursday, February 11, 2016 7:20 PM, Stephen J. Turnbull 
 wrote:



> I think we should keep it around forever.  Even my slowest colleagues
> are learning that they should record their seeds and PRNG algorithms
> for reproducibility's sake. :-)

+1

> For that matter, restore Wichmann-Hill.

So you can write code that works on 2.3 and 3.6, but not 3.5?

I agree that it shouldn't have gone away, but I think it may be too late for 
adding it back to help too much.

> Both should be clearly marked as "use only for reproducing previous
> bitstreams" (eg, in a package random.deprecated_generators).


I like the random.deprecated_generators idea.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Thursday, February 11, 2016 10:35 AM, Jeff Hardy  wrote:

>On Thu, Feb 11, 2016 at 10:15 AM, Andrew Barnert via Python-Dev 
> wrote:
>
>>That's a good point: we need style rules for PEP 8.


...
>>It might be simpler to write a "whitelist" than a "blacklist" of all the ugly 
>>things people might come up with, and then just give a bunch of examples 
>>instead of a bunch of rules. Something like this:
>>
>>While underscores can legally appear anywhere in the digit string, you should 
>>never use them for purposes other than visually separating meaningful digit 
>>groups like thousands, bytes, and the like.
>>
>>123456_789012: ok (millions are groups, but thousands are more common, 
>> and 6-digit groups are readable, but on the edge)
>>123_456_789_012: better
>>123_456_789_012_: bad (trailing)
>>1_2_3_4_5_6: bad (too many)
>>1234_5678: ok if code is intended to deal with east-Asian numerals (where 
>> grouping by four digits is standard), bad otherwise
>>3__141_592_654: ok if this represents a fixed-point fraction (obviously 
>> bad otherwise)
>>123.456_789e123: good
>>123.456_789e1_23: bad (never useful in exponent)
>>0x1234_5678: good
>>0o123_456: good
>>0x123_456_789: bad (3 hex digits is usually not a meaningful group)
>

>I imagine that for whatever "bad" grouping you can suggest, someone, 
>somewhere, has a legitimate reason to use it. 

That's exactly why we should just have bad examples in the style guide, rather 
than coming up with style rules that try to strongly discourage them (or making 
them syntax errors).

>Any rule more complex than "Use underscores in numeric literals only when they 
>improve clarity" is unnecessarily prescriptive.

Your rule doesn't need to be stated at all. It's already a given that you 
shouldn't add semantically-meaningless characters anywhere unless they improve 
clarity.

I don't think saying that they're for "visually separating meaningful digit 
groups like thousands, bytes, and the like" is unnecessarily prescriptive. If 
someone comes up with a legitimate use for something we've never anticipated, 
it will almost certainly just be a way of grouping digits that's meaningful in 
a way we didn't anticipate. And, if not, it's just a style guideline, so it 
doesn't have to apply 100% of the time. If someone really comes up with 
something that has nothing to do with grouping digits, all the style guideline 
will do is make them stop and think about whether it really is a good use of 
underscores--and, if it is, they'll go ahead and do it.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 10:15, Andrew Barnert via Python-Dev 
 wrote:
> 
> That's a good point: we need style rules for PEP 8.

One more point: should the tutorial mention underscores? It looks like the 
intro docs for a lot of the other languages do. And it would only take one 
short sentence in 3.1.1 Numbers to say that you can use underscores to make 
large numbers like 123_456.789_012 more readable.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 09:39, Terry Reedy  wrote:
> 
> If trailing _ is allowed, to simplify the implementation, I would like PEP 8, 
> while on the subject, to say something like "While trailing _s on numbers are 
> allowed, to simplify the implementation, they serve no purpose and are 
> strongly discouraged".

That's a good point: we need style rules for PEP 8.

But I think everything that's just obviously pointless (like putting an 
underscore between every pair of digits, or sprinkling underscores all over a 
huge number to make ASCII art), or already handled by other guidelines (e.g., 
using a ton of underscores to "line up a table" is the same as using a ton of 
spaces, which is already discouraged) doesn't really need to be covered. And I 
think trailing underscores probably fall into that category.

It might be simpler to write a "whitelist" than a "blacklist" of all the ugly 
things people might come up with, and then just give a bunch of examples 
instead of a bunch of rules. Something like this:

While underscores can legally appear anywhere in the digit string, you should 
never use them for purposes other than visually separating meaningful digit 
groups like thousands, bytes, and the like.

123456_789012: ok (millions are groups, but thousands are more common, and 
6-digit groups are readable, but on the edge)
123_456_789_012: better
123_456_789_012_: bad (trailing)
1_2_3_4_5_6: bad (too many)
1234_5678: ok if code is intended to deal with east-Asian numerals (where 
grouping by four digits is standard), bad otherwise
3__141_592_654: ok if this represents a fixed-point fraction (obviously bad 
otherwise)
123.456_789e123: good
123.456_789e1_23: bad (never useful in exponent)
0x1234_5678: good
0o123_456: good
0x123_456_789: bad (3 hex digits is usually not a meaningful group)

The one case that seems contentious is "123_456_j". Honestly, I don't care 
which way that goes, and I'd be fine if the PEP left out any mention of it, but 
if people feel strongly one way or the other, the PEP could just give it as a 
good or a bad example and that would be enough to clarify the intention.



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 00:22, Georg Brandl  wrote:
> 
> Allowing underscores in string arguments to the ``Decimal`` constructor.  It
>  could be argued that these are akin to literals, since there is no Decimal
>  literal available (yet).

I'm +1 on this. Partly for consistency (see below)--but also, one of the use 
cases for Decimal is when you need more precision than float, meaning you'll 
often have even more digits to separate.

> * Allowing underscores in string arguments to ``int()`` with base argument 0,
>  ``float()`` and ``complex()``.

+1, because these are actually defined in terms of literals. For example, under 
int, "Base 0 means to interpret exactly as a code literal". This isn't actually 
quite true, because "-2" is not an integer literal but is accepted here--but 
see float for an example that *is* rigorously defined, and still defers to 
literal syntax and semantics.
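For what it's worth, this is how those constructor cases ended up behaving 
once PEP 515 landed in Python 3.6 (a quick illustrative snippet, not part of 
the PEP text):

    assert 123_456_789 == 123456789
    assert int("1_000") == 1000            # plain int() accepts underscores too
    assert int("0x_FF", 0) == 255          # base 0: interpret as a code literal
    assert float("1_000.000_1") == 1000.0001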


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 02:13, Steven D'Aprano  wrote:
> 
>> On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:

>> They're both presented as something the syntax allows, and neither one 
>> looks like something I'd ever want to write, much less promote in a 
>> style guide or something, but neither one screams out as something 
>> that's so heinous we need to complicate the language to ensure it 
>> raises a SyntaxError. Yes, that's my opinion, but do you really have a 
>> different opinion about any part of that?
> 
> I don't think the rule "underscores must occur between digits" is 
> complicating the specification.

That rule isn't in the specification in the PEP, except as one of the 
alternatives rejected for being "too restrictive". It's also not the rule you 
were suggesting in your previous email, arguing where you insisted that you 
wanted something "more liberal". I also don't understand why you're presenting 
this whole thing as an argument against my response, which was suggesting that 
whatever rule we choose should be simpler than what's in the PEP, when that's 
also (apparently, now) your position.

> It is *less* complicated to explain this 
> rule than to give a whole lot of special cases

Sure. Your rule is about as complicated as the Swift rule, and both are much 
less complicated than the PEP. I'm fine with either one, because, as I said, 
the edge cases don't matter to me nearly as much as having a rule that's easy 
to keep in my head and easy to lex. The only reason I specifically proposed the 
Swift rule instead of one of the other simple rules is that it seemed the most 
"liberal", which the PEP was in favor of, and it has precedent in more 
other languages. But, in favor of your version, almost every language uses some 
variation of "you can put underscores between digits" as the "tutorial-level" 
explanation and rationale.




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-10 Thread Andrew Barnert via Python-Dev
On Feb 10, 2016, at 16:21, Steven D'Aprano  wrote:
> 
>> On Wed, Feb 10, 2016 at 03:45:48PM -0800, Andrew Barnert via Python-Dev 
>> wrote:
>> On Feb 10, 2016, at 14:20, Georg Brandl  wrote:
>> 
>> First, general questions: should the PEP mention the Decimal constructor? 
>> What about int and float (I'd assume int(s) continues to work as always, 
>> while int(s, 0) gets the new behavior, but if that isn't obviously true, it 
>> may be worth saying explicitly).
>> 
>>> * Trailing underscores are not allowed, because they look confusing and 
>>> don't
>>> contribute much to readability.
>> 
>> Why is "123_456_" so ugly that we have to catch it, when 
>> "1___2_345__6" is just fine,
> 
> It's not just fine, it's ugly as sin, but it shouldn't be a matter for 
> the parser to decide a style-issue.

Exactly. So why should it be any more of a matter for the parser to decide that 
"123_456_" is illegal? Leave that in the style guide, and keep the parser, and 
the reference documentation, as simple as possible.

>> or "123e__+456"?
> 
> That I would prohibit.

The PEP allows that. The simpler rule used by Swift and Rust prohibits it.

>> More to the point, 
>> if we really need an extra rule, and more complicated BNF, to outlaw 
>> this case, I don't think we want a liberal design at all.
> 
> I think "underscores can occur between any two digits" is pretty 
> liberal, since it allows multiple underscores, and allows grouping in 
> any size group (including mixed sizes, and stupid sizes like 1).

The PEP calls that a type-2 conservative proposal, and uses "liberal" to mean 
that underscores can appear in places that aren't between digits. I don't think 
we want that liberalism, especially if it requires 5 rules instead of 1 to get 
it right.

Again, Swift and Rust only allow underscores in the digit part of integers, and 
the up to three digit parts of floats, and the only rule they impose is no 
leading underscore. (In some caass they lead to ambiguity, in others they 
don't, but it's easier to just always ban them.) I don't see anything wrong 
with that rule. The fact that it doesn't allow "1.2e_+3" seems fine. The fact 
that it doesn't prevent "123_" seems fine also. It's not about being as liberal 
as possible, or as restrictive as possible, because those edge cases just don't 
matter, so being as simple as possible seems like an obvious win.

>> Also, notice that Swift, Rust, and D all show examples with trailing 
>> underscores in their references, and they don't look particularly out 
>> of place with the other examples.
> 
> That's a matter of opinion.

Sure, but it's apparently the opinion of the people who designed and/or 
documented this feature in three out of the four languages I looked at (aka 
every language but Perl), not mine.

And honestly, are you really claiming that in your opinion, "123_456_" is worse 
than all of their other examples, like "1_23__4"?

They're both presented as something the syntax allows, and neither one looks 
like something I'd ever want to write, much less promote in a style guide or 
something, but neither one screams out as something that's so heinous we need 
to complicate the language to ensure it raises a SyntaxError. Yes, that's my 
opinion, but do you really have a different opinion about any part of that?


Re: [Python-Dev] Windows: Remove support of bytes filenames in theos module?

2016-02-10 Thread Andrew Barnert via Python-Dev
On Feb 10, 2016, at 15:11, eryk sun  wrote:
> 
> On Wed, Feb 10, 2016 at 2:30 PM, Andrew Barnert via Python-Dev
>  wrote:
>>  [^3]: Say you write a program that assumes it will only be run on Shift-JIS 
>> systems, and you use
>> CreateFileA to create a file named "ハローワールド". The actual bytes you're 
>> sending are cp436
>> for "ânâìü[âÅü[âïâh", so the file on the CD is named, in Unicode, 
>> "ânâìü[âÅü[âïâh".
> 
> Unless the system default was changed or the program called
> SetFileApisToOEM, CreateFileA would decode using the ANSI codepage
> 1252, not the OEM codepage 437 (not 436), i.e.
> "ƒnƒ\x8d\x81[ƒ\x8f\x81[ƒ‹ƒh". Otherwise the example is right. But the
> transcoding strategy won't work in general. For example, if the tables
> are turned such that the ANSI codepage is 932 and the program passes a
> bytes name from codepage 1252, the user on the other end won't be able
> to transcode without error if the original bytes contained invalid
> DBCS sequences that were mapped to the default character, U+30FB.
> This
> transcodes as the meaningless string "\x81E". The user can replace
> that string with "--" and enjoy a nice game of hang man.

Of course there's no way to recover the actual intended filenames if that 
information was thrown out instead of being stored, but that's no different 
from the situation where the user mashed the keyboard instead of typing what 
they intended.

The point remains: the Mac strategy (which is also the linux strategy for 
filesystems that are inherently UTF-16) always generates valid UTF-8, and 
doesn't try to magically cure mojibake but doesn't get in the way of the user 
manually curing it. When the Unicode encoding is lossy, of course the user 
can't cure that, but UTF-8 isn't making it any harder.



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-10 Thread Andrew Barnert via Python-Dev
On Feb 10, 2016, at 14:20, Georg Brandl  wrote:

First, general questions: should the PEP mention the Decimal constructor? What 
about int and float (I'd assume int(s) continues to work as always, while 
int(s, 0) gets the new behavior, but if that isn't obviously true, it may be 
worth saying explicitly).

> * Trailing underscores are not allowed, because they look confusing and don't
>  contribute much to readability.

Why is "123_456_" so ugly that we have to catch it, when "1___2_345__6" is 
just fine, or "123e__+456"? More to the point, if we really need an extra rule, 
and more complicated BNF, to outlaw this case, I don't think we want a liberal 
design at all.

Also, notice that Swift, Rust, and D all show examples with trailing 
underscores in their references, and they don't look particularly out of place 
with the other examples.

> There appears to be no reason to restrict the use of underscores otherwise.

What other restrictions are there? I think the only place you've left that's 
not between digits is between the e and the sign. A dead-simple rule like 
Swift's seems better than five separate rules that I have to learn and remember 
that make lexing more complicated and that ultimately amount to the 
conservative rule plus one other place I can put underscores where I'd never 
want to.

> **Group 1: liberal (like this PEP)**
> 
> * D [2]_
> * Perl 5 (although docs say it's more restricted) [3]_
> * Rust [4]_
> * Swift (although textual description says "between digits") [5]_

I don't think any of these are liberal like this PEP.

For example, Swift's actual grammar rule allows underscores anywhere but 
leading in the "digits" part of int literals and all three potential digit 
parts of float literals. That's the whole rule. It's more conservative than 
this PEP in not allowing them outside of digit parts (like between E and +), 
more liberal in allowing them to be trailing, but I'm pretty sure the reason 
behind the design wasn't specifically about how liberal or conservative they 
wanted to be, but about being as simple as possible. Rust's rule seems to be 
equivalent to Swift's, except that they forgot to define exponents anywhere. I 
don't think either of them was trying to be more liberal or more conservative; 
rather, they were both trying to be as simple as possible.

D does go out of its way to be as liberal as possible, e.g., allowing things 
like "0x_1_" that the others wouldn't (they'd treat the "_1_" as a digit part, 
which can't have leading underscores), but it's also more conservative than 
this spec in not allowing underscores between e and the sign.

I think Perl is the only language that allows them anywhere but in the digits 
part.



Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?

2016-02-10 Thread Andrew Barnert via Python-Dev
On Wednesday, February 10, 2016 6:50 AM, Stephen J. Turnbull 
 wrote:
> Andrew Barnert via Python-Dev writes:
> 
>>  That doesn't mean the problem can't be solved. Apple solved their
>>  equivalent problem, albeit by sacrificing backward compatibility in
>>  a way Microsoft can't get away with. I haven't seen a MacRoman or
>>  Shift-JIS filename since they broke the last holdout
> 
> If you lived where I do, you'd still be seeing both, because you
> wouldn't be able to escape archival files on CD and removable media
> (typically written on Windows boxen). They still work, sort of --
> same as always, and as far as I know, that's because Apple has *not*
> sacrificed backward compatibility: under the hood, Darwin is still a
> POSIX kernel which thinks of file names and everything else outside of
> memory as bytestreams.


Sure, but the Darwin kernel can't read CDs; that's up to the CD filesystem 
driver.


Anyway, Windows CDs can't cause this problem. Windows CDs use the Joliet 
filesystem,[^1] which stores everything in UCS2.[^2] When you call CreateFileA 
or fopen or _open with bytes, Windows decodes those bytes and stores them as 
UCS2. The filesystem drivers on POSIX platforms have to encode that UCS2 to 
_something_ (POSIX APIs make it very hard for you to deal with filename strings 
like 
"A\0B\0C\0.\0T\0X\0T\0\0\0"...). The linux driver uses a mount option to decide 
how to encode; the OS X driver always uses UTF-8. And every valid UCS2 string 
can be encoded as UTF-8, so you can use unicode everywhere, even in Python 2.
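
In other words, conceptually (a sketch, with the endianness simplified to match 
the byte layout above):

    ucs2 = b'A\x00B\x00C\x00.\x00T\x00X\x00T\x00'
    name = ucs2.decode('utf-16-le')      # 'ABC.TXT'
    posix_name = name.encode('utf-8')    # b'ABC.TXT' -- always valid UTF-8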

Of course you can have mojibake problems, but that's a different issue,[^3] and 
no worse with unicode than with bytes.[^4]

The same thing is true with NTFS external drives, VFAT USB drives, etc. 
Generally, it's usually not Windows media on *nix systems that break Python 2 
unicode; it's native *nix filesystems where users mix locales.

> One place they *fail very badly* is Shift JIS filenames in zipfiles,
> which nothing provided by Apple can deal with safely, and InfoZip
> breaks too (at least in MacPorts). Yes, I know that is specifically
> disallowed. Feel free to tell 1__ Japanese Windows users.

The good news is, as far as I can tell, it's not disallowed anymore.[^5] So we 
just have to tell them that they shouldn't have been doing it in the past. :)

Anyway, zipfiles are data files as far as the OS is concerned; the fact that 
they contain filenames is no more relevant to the kernel (or filesystem driver 
or userland) than the fact that "List of PDFs to Read This Weekend.txt" 
contains filenames.

PS, everything Apple provides is already using Info-ZIP.


>>  So Python 2 works great on Macs, whether you use bytes or
>>  unicode. But that doesn't help us on Windows, where you can't use
>>  bytes, or Linux, where you can't use Unicode (without surrogate
>>  escape or some other mechanism that Python 2 doesn't have).
> 
> You contradict yourself! ;-)

Yes, as I later realized, sometimes, you _can_ (or at least ought to be able 
to--I haven't actually tried) use Python 2 with unicode everywhere to write 
cross-platform software that actually works on linux, by using backports of 
surrogate-escape and pathlib, and the io module instead of the file type, as 
long as you only need stdlib and third-party modules that support unicode 
filenames. If that does work for at least some apps, then I'm perfectly happy 
to have been wrong earlier. And if catching myself before someone else did 
makes me a flip-flopper, well, I'm not running for president. :P


  [^1]: Except when Vista and 7 mistakenly think your CD is a DVD and use UDF 
instead of ISO9660--but in that case, the encoding is stored in the filesystem 
header, so it's also not a problem.

  [^2]: Actually, despite Microsoft's spec, later versions of Windows store 
UTF-16, even if there are surrogate pairs, or BMP-but-post-UCS2 code points. 
But that doesn't matter here; the linux, Mac, etc. drivers all assume UTF-16, 
which works either way.

  [^3]: Say you write a program that assumes it will only be run on Shift-JIS 
systems, and you use CreateFileA to create a file named "ハローワールド". The actual 
bytes you're sending are cp436 for "ânâìü[âÅü[âïâh", so the file on the CD is 
named, in Unicode, "ânâìü[âÅü[âïâh". So of course the Mac driver encodes that 
to UTF-8 b"ânâìü[âÅü[âïâh". You won't have any problems opening what you 
readdir, or what you copy from a UTF-8 terminal or a UTF-16 Cocoa app like 
Finder, etc. But of course you will have trouble getting your user to recognize 
that name as meaningful, unless you can figure out or guess or prompt the user 
to guess that it needs to be passed through 
s.encode('cp436')
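
In full, the cure being described is presumably this round trip (with cp437; 
Python has no 'cp436' codec):

    name = 'ハローワールド'
    mojibake = name.encode('shift_jis').decode('cp437')    # 'ânâìü[âÅü[âïâh'
    cured = mojibake.encode('cp437').decode('shift_jis')   # the original again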

Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?

2016-02-10 Thread Andrew Barnert via Python-Dev
On Wednesday, February 10, 2016 12:47 AM, Victor Stinner 
 wrote:

> > 2016-02-10 9:30 GMT+01:00 Paul Moore :
>>  Whether removing the bytes interface is feasible, given that there's
>>  then no way that works across Python 2 and 3 of writing code that
>>  manipulates the sort of bytes-that-use-multiple-encodings data that
>>  you mention, is a separate issue.

Well, there's a surrogate-escape backport on PyPI (I think there's a standalone 
one, and one in python-future), so you _could_ do everything the same as in 3.x.

Depending on what you're doing, you may also need to use the io module instead 
of file (which may just mean "from io import open", but could mean more work), 
wrap the stdio streams explicitly, manually decode argv, etc. But someone could 
write a six-like module (or add it to six) that does all of that. It may be a 
little slower and more memory-intensive in 2.7 than in 3.x, but for most apps, 
that doesn't matter. The big problem would be third-party libraries (and stdlib 
modules like csv) that want to use bytes in 2.x; convincing them all to support 
full-on-unicode in 2.x might be more trouble than it's worth. Still, if I were 
feeling the pain of maintaining lots of linux-bytes-Windows-unicode-2.7 code, 
I'd try it and see how far I get.
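
As a minimal sketch of that pattern (the file and names are purely 
illustrative):

    from io import open  # 2.7's io.open behaves like the 3.x builtin
    import sys

    def decoded_argv():
        # argv is bytes on 2.x, already str on 3.x
        fsenc = sys.getfilesystemencoding() or 'utf-8'
        return [a if isinstance(a, type(u'')) else a.decode(fsenc)
                for a in sys.argv]

    with open(u'names.txt', encoding='utf-8') as f:
        names = [line.rstrip(u'\n') for line in f]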

> It's annoying that 8 years after the release of Python 3.0, Python 3
> is still stuck by Python 2 :-(

I understand the frustration, but... time already goes too fast at my age; 
don't skip me ahead almost a whole year to December 2016. :)

Also, unless you're the one guy who actually abandoned 2.6 for 3.0, it's 
probably more useful to count from 2.7, 3.2, or the no-2.8 declaration, which 
are all about 5 years ago.


Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?

2016-02-10 Thread Andrew Barnert via Python-Dev
On Feb 9, 2016, at 20:17, Stephen J. Turnbull  wrote:

>> It really requires going through all the OS calls and either (a) making 
>> them consistently decode bytes to str using the declared FS encoding 
>> (currently 'mbcs', but I see no reason we can't make it 'utf_8'),
> 
> If it were that easy, it would have been done two decades ago.  I'm no
> fan of Windows[1], but it's obvious that Microsoft has devoted
> enormous amounts of brainpower to the problem of encoding
> rationalization since the early 90s.  I don't think they would have
> missed this idea.

Microsoft spent a lot of time and effort on the idea that UTF-16 (or, 
originally, UCS-2) everywhere was the answer. Never call the A functions (or 
the msvcrt functions that emulate the C and POSIX stdlib), and there's never a 
problem. What if you read filenames out of a text file? No problem; text files 
are UTF-16-BOM. Over a socket? All network protocols are also UTF-16. What if 
you have to read a file written in Unix? Come on, nobody's ever created a 
useful file without Windows. What about Windows 3.1? Uh... that's a problem. 
Also, what happens when Unicode goes over 64k characters? And so on. So their 
grand project failed.

That doesn't mean the problem can't be solved. Apple solved their equivalent 
problem, albeit by sacrificing backward compatibility in a way Microsoft can't 
get away with. I haven't seen a MacRoman or Shift-JIS filename since they broke 
the last holdout (the low-level AppleEvent interface) in 10.7--and most of the 
apps I was using back then don't run on 10.10 without an update. So Python 2 
works great on Macs, whether you use bytes or unicode. But that doesn't help us 
on Windows, where you can't use bytes, or Linux, where you can't use Unicode 
(without surrogate escape or some other mechanism that Python 2 doesn't have).


Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?

2016-02-09 Thread Andrew Barnert via Python-Dev
On Feb 9, 2016, at 17:37, Steve Dower  wrote:
> 
> Could we perhaps redefine bytes paths on Windows as utf8 and use Unicode 
> everywhere internally?

When you receive bytes from argv, stdin, a text file, a GUI, a named pipe, 
etc., and then use them as a path, Python treating them as UTF-8 would break 
everything.

Plus, the problem only exists in Python 2, and Python is not going to fix 
Unicode support in Python 2, both because it's too late for such a major change 
in Python 2, and because it's probably impossible* (which is why we have Python 
3 in the first place). 

> I really don't like the idea of not being able to use bytes in cross platform 
> code. Unless it's become feasible to use Unicode for lossless filenames on 
> Linux - last I heard it wasn't.

It is, and has been for years. Surrogate escaping solved the linux problem. 
That doesn't help for Python 2, but again, it's too late for Python 2.
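
For reference, the 3.x round trip looks like this (the PyPI backports register 
the same error handler on 2.7):

    raw = b'caf\xe9'                               # latin-1 bytes on a UTF-8 box
    name = raw.decode('utf-8', 'surrogateescape')  # 'caf\udce9'
    assert name.encode('utf-8', 'surrogateescape') == raw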


* Well, maybe in the future, some linux distros will bite the same bullet OS X 
did and mandate that filesystem drivers must expose UTF-8, doing whatever 
transcoding or other munging is necessary under the covers, to be valid. But 
I'm guessing any such distros will be all-Python-3 long before then, and the 
people using Python 2 will also be using old versions or conservative distros.


Re: [Python-Dev] Experiences with Creating PEP 484 Stub Files

2016-02-09 Thread Andrew Barnert via Python-Dev
On Feb 9, 2016, at 03:44, Phil Thompson  wrote:
> 
> There are a number of things I'd like to express but cannot find a way to do 
> so...
> 
> - objects that implement the buffer protocol

That seems like it should be filed as a bug with the typing repo. Presumably 
this is just an empty type that registers bytes, bytearray, and memoryview, and 
third-party classes have to register with it manually?

> - type objects
> - slice objects

Can't you just use the concrete types type and slice for these two? I don't 
think you need generic or abstract "any metaclass, whether inheriting from type 
or not" or "any class that meets the slice protocol", do you?

> - capsules

That one seems reasonable. But maybe there should just be a types.CapsuleType 
or types.PyCapsule, and then you can just check it the same way as any other 
concrete type?

But how often do you need to verify that something is a capsule, without 
knowing that it's the *right* capsule? At runtime, you'd usually use 
PyCapsule_IsValid, not PyCapsule_CheckExact, right? So should the type checker 
be tracking the name too?

> - sequences of fixed size (ie. specified in the same way as Tuple)

How would you disambiguate between a sequence of one int and a sequence of 0 or 
more ints if they're both spelled "Sequence[int]"? That isn't a problem for 
Tuple, because it's assumed to be heterogeneous, so Tuple[int] can only be a 
1-tuple. (This was actually discussed in some depth. I thought it would be a 
problem, because some types--including tuple itself--are sometimes used as 
homogenous arbitrary-length containers and sometimes as heterogeneous 
fixed-length containers, but Guido and others had some good answers for that, 
even if I can't remember what they were.)
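
(For what it's worth, the spelling that settles the tuple half of this is the 
Ellipsis form:

    from typing import Sequence, Tuple

    OneInt = Tuple[int]      # heterogeneous spelling: exactly one int
    Ints = Tuple[int, ...]   # homogeneous tuple of any length
    IntSeq = Sequence[int]   # sequences never encode their length

so only Sequence is left ambiguous.)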

> - distinguishing between instance and class attributes.

Where? Are you building a protocol that checks the data members of a type for 
conformance or something? If so, why is an object that has "spam" and "eggs" as 
instance attributes but "cheese" as a class attribute not usable as an object 
conforming to the protocol with all three attributes? (Also, does @property 
count as a class or instance attribute? What about an arbitrary data 
descriptor? Or a non-data descriptor?)


Re: [Python-Dev] Issue #26204: compiler now emits a SyntaxWarning on constant statement

2016-02-09 Thread Andrew Barnert via Python-Dev
On Tuesday, February 9, 2016 8:14 AM, Michel Desmoulin 
 wrote:

> I give regular Python trainings and I see similar errors regularly such as:
> 
> - not returning something;
> - using something without putting the result back in a variable.
> 
> However, these are impossible to warn about.
> 
> What's more, I have yet to see somebody creating a constant and not 
> doing anything with it. I never worked with Ruby dev though.
> 

> My sample of dev is not big enough to be significant, but I haven't met 
> this issue yet. I still like the idea, anything making Python easier for 
> beginners is a good thing for me.

What idea do you like? Somehow warning about the things that are impossible to 
warn about? Or warning about something different that isn't any of the things 
your novices have faced? Or...?

> One particular argument against it is the use of linters, but you must 
> realize most beginners don't use linters.

That doesn't mean the compiler should do everything linters do.

Rank beginners are generally writing very simple programs, where the whole 
thing can be visualized at once, so many warnings aren't relevant. And they 
haven't learned many important language features, so many warnings are 
relevant, but they aren't prepared to deal with them (e.g., global variables 
everywhere because they haven't learned to declare functions yet). As a 
teacher, do you want to explain all those warnings to them? Or teach them the 
bad habit of ignoring warnings? Or just not teach them to use linters (or 
static type checkers, or other such tools) until they're ready to write code 
that should pass without warnings?

Part of learning to use linters effectively is learning to configure them. 
That's almost certainly not something you want to be teaching beginners when 
they're just starting out. But if the compiler started adding a bunch of 
warnings that people had to configure, a la gcc, you'd be forced to teach them 
right off the bat.

And meanwhile, once past the initial stage, many beginners _do_ use linters, 
they just don't realize it. If you use PyCharm or Eclipse/PyDev or almost any 
IDE except IDLE, it may be linting in the background and showing you the 
results as inline code hints, or in some other user-friendly way, or at least 
catching some of the simpler things a linter would check for. Whether you want 
to use those tools in your teaching is up to you, but they exist. And if they 
need any support from the compiler to do their job better, presumably they'd 
ask for it.

> They are part of a toolkit you learn to use 
> on the way, but not something you start with. Besides, many people using 
> Python are not dev, and will just never take the time to use linters, 
> not learn about them.


If people who aren't going to go deep enough into Python to write scripts 
longer than a page don't need linters, then they certainly don't need a bunch 
of warnings from the compiler either.


Re: [Python-Dev] Issue #26204: compiler now emits a SyntaxWarning on constant statement

2016-02-08 Thread Andrew Barnert via Python-Dev
> On Feb 8, 2016, at 11:13, Guido van Rossum  wrote:
> 
>> On Mon, Feb 8, 2016 at 9:44 AM, Victor Stinner  
>> wrote:
>> I changed the Python compiler to ignore any kind "constant
>> expressions", whereas it only ignored strings and integers before:
>> http://bugs.python.org/issue26204
>> 
>> The compiler now also emits a SyntaxWarning on such case. IMHO the
>> warning can help to detect bugs for developers who just learnt Python.
> 
> Hum. I'm not excited by this idea. It is not bad syntax. Have you
> actually seen newbies who were confused by such things?

This does overlap to some extent with a problem that newbies *do* get confused 
by (and that transplants from Ruby don't find confusing, but do keep 
forgetting): writing an expression as the last statement in a function and then 
getting a TypeError or AttributeError about NoneType from the caller. Victor's 
example of a function that was presumably meant to return False, but instead 
just evaluates False and returns None, does happen.

But often, that last expression isn't a constant, but something like self.y - 
self.x. So I'm not sure how much this warning would help that case. In fact, it 
might add to the confusion if sometimes you get a warning and sometimes you 
don't. (And you wouldn't want a warning about any function with no return whose 
last statement is an expression, because often that's perfectly reasonable 
code, where the expression is a mutating method call, like 
self.spam.append(arg).)
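
A sketch of the asymmetry (names purely illustrative):

    def is_positive(x):
        if x < 0:
            return False
        True  # constant statement: the new SyntaxWarning fires, but the
              # function still falls off the end and returns None

    class Span:
        def width(self):
            self.y - self.x  # same bug, not a constant: no warning possible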

Also, there are plenty of other common newbie/transplant problems that are 
similar to this one but can't be caught with a warning, like just referencing a 
function or method instead of calling it because you left the parens off. 
That's *usually* a bug, but not always--it could be a LBYL check for an 
attribute's presence, for example.


Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?

2016-02-08 Thread Andrew Barnert via Python-Dev
On Monday, February 8, 2016 9:11 AM, Alexander Walters 
 wrote:


> 
> On 2/8/2016 12:02, Brett Cannon wrote:
>> 
>> 
>>  If Unicode string don't work in Python 2 then what is Python 2/3 to do 
>>  as a cross-platform solution if we completely remove bytes support in 
>>  Python 3? Wouldn't that mean there is no common type between Python 2 
>>  & 3 that one can use which will work with the os module except native 
>>  strings (which are difficult to get right)?
> 
> The only solution then would be to do `if not PY3: arg = 
> arg.encode(...); os.SOMEFUNC(arg)`, pardon my pseudocode.  
That's exactly what you _don't_ want to do.

More generally, the assumption here is wrong. 

It's not true that you can't use Unicode for Window filenames on Python 2. What 
is true is that you have to be a lot more careful about using Unicode 
_consistently_. And that Python 2 gives you very little help in doing so. And 
some third-party modules may make it harder on you. But if you always use 
unicode, `os.listdir(u'.')` calls FindFirstFileW instead of FindFirstFileA and 
gives you back unicode filenames, os.stat or open call _wstat or _wopen with 
those unicode filenames, etc.
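
As a sketch, on 2.7/Windows this stays unicode end to end:

    import os

    names = os.listdir(u'.')  # u'.' -> FindFirstFileW -> unicode names back
    sizes = dict((n, os.stat(n).st_size) for n in names)  # _wstat under the hood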

The problem is that on POSIX, you're often better off using str everywhere, 
because Python 2.7 doesn't do surrogate escape. And once you're using str on 
one platform/unicode on the other for filenames, it gets very easy to mix str 
and unicode in other places (like strings you want to print out for the user or 
store in a database), and then you're in mojibake hell.

The io module, the pathlib backport, and six can help a bit (at the cost of 
performance and/or simplicity), but there's no easy answer--if there _were_ an 
easy answer, we wouldn't have Python 3 in the first place, right?


Re: [Python-Dev] Improving docs for len() of set

2016-02-08 Thread Andrew Barnert via Python-Dev
On Monday, February 8, 2016 8:23 AM, Ben Hoyt  wrote:


>Just a suggestion for a documentation tweak. Currently the docs for len() on a 
>set say this:

>
>   .. describe:: len(s)
>
>  Return the cardinality of set *s*.
>
>I'm a relatively seasoned programmer, but I don't really have a maths 
>background, and I didn't know what "cardinality" meant. I could kind of grok 
>it by context, but could we change this to something like the following?
>
>   .. describe:: len(s)
>
>  Return the number of elements in set *s* (cardinality of *s*).


+{{}}

(using the normal von Neumann definitions for 0={} and Succ(n) = n U {n})


Re: [Python-Dev] More optimisation ideas

2016-02-05 Thread Andrew Barnert via Python-Dev
On Friday, February 5, 2016 11:57 AM, Emile van Sebille  wrote:



> Aah, 'must' is less restrictive in this context than I expected. When 
> you combine the two halves the first part might be more accurately 
> phrased as 'The program must make source code available' rather than 
> 'must include' which I understood to mean 'ship with'.

First, step back and think of this in common sense terms: If being open source 
required any Python installation to have the .py source to the .pyc or .zip 
files in the stdlib, surely it would also require any Python installation to 
have the .c source to the interpreter too. But lots of people have Python 
without having the .c source.

Also, the GPL isn't typical of all open source licenses, it's only typical of 
_copyleft_ licenses. Permissive licenses, like Python's, are very different. 
Copyleft licenses are designed to make sure that all derived works are also 
copylefted; permissive licenses are designed to permit derived works as widely 
as possible. As the Python license specifically says, "All Python licenses, 
unlike the GPL, let you distribute a modified version without making your 
changes open source."

Meanwhile, the fact that someone has decided that the Python license qualifies 
under the Open Source Definition doesn't mean the OSD is the right way to 
understand it. Read the license itself, or one of the summaries at 
opensource.org or fsf.org. (And if you still can't figure something out, and 
it's important to your work, you almost certainly need to ask a lawyer.) So, if 
you think the first sentence of section 2 of the OSD contradicts the 
explanation in the rest of the paragraph--well, even if you're right, that 
doesn't affect Python's license at all.

Finally, if you want to see what it takes to actually make all the terms 
unambiguous both to ordinary human beings and to legal codes, see the GPL FAQ 
sections on their definitions of "propagate" and "convey". It may take you lots 
of careful reading to understand it, but when you finally do, it's definitely 
unambiguous.


Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Andrew Barnert via Python-Dev
On Feb 1, 2016, at 19:44, Terry Reedy  wrote:
> 
>> On 2/1/2016 3:39 PM, Andrew Barnert via Python-Dev wrote:
>> 
>> There are already multiple duplicate questions every month on
>> StackOverflow from people asking "how do I find the source to stdlib
>> module X". The canonical answer starts off by explaining how to
>> import the module and use its __file__, which everyone is able to
>> handle.
> 
> Perhaps even easier: start IDLE, hit Alt-M, type in module name as one would 
> import it, click OK.  If Python source is available, IDLE will open in an 
> editor window, with the path on the title bar.
> 
>> If we have to instead explain how to work out the .py name
>> from the qualified module name, how to work out the stdlib path from
>> sys.path, and then how to find the source from those two things, with
>> the caveat that it may not be installed at all on some platforms, and
>> how to make sure what they're asking about really is a stdlib module,
>> and how to make sure they aren't shadowing it with a module elsewhere
>> on sys.path, that's a lot more complicated.
> 
> The window has the path on the title bar, so one can tell what was loaded.

The point of this thread is the suggestion that the stdlib modules be frozen or 
stored in a zipfile, unless a user modifies things in some way to make the 
source accessible. So, if a user hasn't done that (which no novice will know 
how to do), there won't be a path to show in the title bar, so IDLE won't be 
any more help than the command line.

(I suppose IDLE could grow a new feature to look up "associated source files" 
for a zipped stdlib or something, but that seems like a pretty big new feature.)

> IDLE currently uses imp.find_module (this could be updated), with a backup of 
> __import__(...).__file__, so it will load non-stdlib files that can be 
> imported.
> 
> > Finally, on Linux and Mac, the stdlib will usually be somewhere
> > that's not user-writable
> 
> On Windows, this depends on the install location.  Perhaps there should be an 
> option for edit-save or view only to avoid accidental changes.

The problem is that, if the standard way for users to see stdlib sources is to 
copy them from somewhere else (like $install/src/Lib) into a stdlib directory 
(like $install/Lib), then that stdlib directory has to be writable--and on Mac 
and Linux, it's not.



Re: [Python-Dev] More optimisation ideas

2016-02-01 Thread Andrew Barnert via Python-Dev
On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote:
> 
>  If the stdlib were to use implicit namespace packages
> ( https://www.python.org/dev/peps/pep-0420/ ) and the various
> loaders/importers as well, then python could do what I've done with an
> embedded python application for years.  Freeze the stdlib (or put it
> in a zipfile or whatever is fast).  Then arrange PYTHONPATH to first
> look on the filesystem and then look in the frozen/ziped storage.

This is a great solution for experienced developers, but I think it would be 
pretty bad for novices or transplants from other languages (maybe even 
including Python 2).

There are already multiple duplicate questions every month on StackOverflow 
from people asking "how do I find the source to stdlib module X". The canonical 
answer starts off by explaining how to import the module and use its __file__, 
which everyone is able to handle. If we have to instead explain how to work out 
the .py name from the qualified module name, how to work out the stdlib path 
from sys.path, and then how to find the source from those two things, with the 
caveat that it may not be installed at all on some platforms, and how to make 
sure what they're asking about really is a stdlib module, and how to make sure 
they aren't shadowing it with a module elsewhere on sys.path, that's a lot more 
complicated. Especially when you consider that some people on Windows and Mac 
are writing Python scripts without ever learning how to use the terminal or 
find their Python packages via Explorer/Finder. 
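
That canonical answer is just this (the path, which varies by install, is 
illustrative):

    >>> import json
    >>> json.__file__
    '/usr/local/lib/python3.5/json/__init__.py'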

And meanwhile, other people would be asking why their app runs slower on one 
machine than another, because they didn't expect that installing python-dev on 
top of python would slow down startup.

Finally, on Linux and Mac, the stdlib will usually be somewhere that's not 
user-writable--and we shouldn't expect users to have to mess with stuff in 
/usr/lib or /System/Library even if they do have sudo access. Of course we 
could put a "stdlib shadow" location on the sys.path and configure it for 
/usr/local/lib and /Library and/or for somewhere in ~, but that just makes the 
lookup procedure even more complicated--not to mention that we've just added 
three stat calls to remove one open, at which point the optimization has 
probably become a pessimization.


Re: [Python-Dev] Opcode cache in ceval loop

2016-02-01 Thread Andrew Barnert via Python-Dev
Looking over the thread and the two issues, you've got good arguments for why 
the improved code will be the most common code, and good benchmarks for various 
kinds of real-life code, but it doesn't seem like you've tried to stress it on 
anything that could be made worse. From your explanations and your code, I 
wouldn't expect that @classmethods, functions stored in the object dict or 
generated by __getattr__, non-function callables as methods, etc. would go 
significantly slower, or code that mixes @properties or __getattr__ proxy 
attributes with real attributes, or uses __slots__, or code that frequently 
writes to a global, etc. But it would be nice to _know_ that they 
don't instead of just expecting it.

Sent from my iPhone

> On Feb 1, 2016, at 11:10, Yury Selivanov  wrote:
> 
> Hi,
> 
> This is the second email thread I start regarding implementing an opcode 
> cache in ceval loop.  Since my first post on this topic:
> 
> - I've implemented another optimization (LOAD_ATTR);
> 
> - I've added detailed statistics mode so that I can "see" how the cache 
> performs and tune it;
> 
> - some macro benchmarks are now 10-20% faster; 2to3 (a real application) is 
> 7-8% faster;
> 
> - and I have some good insights on the memory footprint.
> 
> ** The purpose of this email is to get a general approval from python-dev, so 
> that I can start polishing the patches and getting them reviewed/committed. **
> 
> 
> Summary of optimizations
> 
> 
> When a code object is executed more than ~1000 times, it's considered "hot".  
> It gets its opcodes analyzed to initialize caches for LOAD_METHOD (a new 
> opcode I propose to add in [1]), LOAD_ATTR, and LOAD_GLOBAL.
> 
> It's important to only optimize code objects that were executed "enough" 
> times, to avoid optimizing code objects for modules, classes, and functions 
> that were imported but never used.
> 
> The cache struct is defined in code.h [2], and is 32 bytes long. When a code 
> object becomes hot, it gets a cache offset table allocated for it (+1 byte 
> for each opcode) + an array of cache structs.
> 
> To measure the max/average memory impact, I tuned my code to optimize *every* 
> code object on *first* run.  Then I ran the entire Python test suite.  Python 
> test suite + standard library both contain around 72395 code objects, which 
> required 20Mb of memory for caches.  The test process consumed around 400Mb 
> of memory.  Thus, in the absolute worst-case scenario, the overhead is about 5%.
> 
> Then I ran the test suite without any modifications to the patch. This means 
> that only code objects that are called frequently enough are optimized.  In 
> this mode, only 2072 code objects were optimized, using less than 1Mb of 
> memory for the cache.
> 
> 
> LOAD_ATTR
> -
> 
> Damien George mentioned that they optimize a lot of dict lookups in 
> MicroPython by memorizing last key/value offset in the dict object, thus 
> eliminating lots of hash lookups.  I've implemented this optimization in my 
> patch.  The results are quite good.  A simple micro-benchmark [3] shows ~30% 
> speed improvement.  Here are some debug stats generated by 2to3 benchmark:
> 
> -- Opcode cache LOAD_ATTR hits = 14778415 (83%)
> -- Opcode cache LOAD_ATTR misses   = 750 (0%)
> -- Opcode cache LOAD_ATTR opts = 282
> -- Opcode cache LOAD_ATTR deopts   = 60
> -- Opcode cache LOAD_ATTR total= 1912
> 
> Each "hit" makes LOAD_ATTR about 30% faster.
> 
> 
> LOAD_GLOBAL
> ---
> 
> This turned out to be a very stable optimization.  Here is the debug output 
> of the 2to3 test:
> 
> -- Opcode cache LOAD_GLOBAL hits   = 3940647 (100%)
> -- Opcode cache LOAD_GLOBAL misses = 0 (0%)
> -- Opcode cache LOAD_GLOBAL opts   = 252
> 
> All benchmarks (and real code) have stats like that.  Globals and builtins 
> are very rarely modified, so the cache works really well.  With LOAD_GLOBAL 
> opcode cache, global lookup is very cheap, there is no hash lookup for it at 
> all.  It makes optimizations like "def foo(len=len)" obsolete.
> 
> 
> LOAD_METHOD
> ---
> 
> This is a new opcode I propose to add in [1].  The idea is to substitute 
> LOAD_ATTR with it, and avoid instantiation of BoundMethod objects.
> 
> With the cache, we can store a reference to the method descriptor (I use 
> type->tp_version_tag for cache invalidation, the same thing _PyType_Lookup is 
> built around).
> 
> The cache makes LOAD_METHOD really efficient.  A simple micro-benchmark like 
> [4], shows that with the cache and LOAD_METHOD, "s.startswith('abc')" becomes 
> as efficient as "s[:3] == 'abc'".
> 
> LOAD_METHOD/CALL_FUNCTION without cache is about 20% faster than 
> LOAD_ATTR/CALL_FUNCTION.  With the cache, it's about 30% faster.
> 
> Here's the debug output of the 2to3 benchmark:
> 
> -- Opcode cache LOAD_METHOD hits   = 5164848 (64%)
> -- Opcode cache LOAD_METHOD misses = 12 (0%)
> -- Opcode cache LOAD_METHOD opts   = 94
> -- Opcode cache LOAD_METHOD d

Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread Andrew Barnert via Python-Dev
On Jan 25, 2016, at 19:32, INADA Naoki  wrote:
> 
>> On Tue, Jan 26, 2016 at 12:02 PM, Andrew Barnert  wrote:
>> On Jan 25, 2016, at 18:21, INADA Naoki  wrote:
>> >
>> > I'm very interested in it.
>> >
>> > Ruby 2.2 and PHP 7 are faster than Python 2.
>> > Python 3 is slower than Python 2.
>> 
>> Says who?
> 
> For example, http://benchmarksgame.alioth.debian.org/u64q/php.html
> In Japanese, many people compares language performance by microbench like 
> fibbonacci.

"In Japan, the hand is sharper than a knife [man splits board with karate 
chop], but the same doesn't work with a tomato [man splatters tomato all over 
himself with karate chop]."

A cheap knife really is better than a karate master at chopping tomatoes. And 
Python 2 really is better than Python 3 at doing integer arithmetic on the edge 
of what can fit into a machine word. But so what? Without seeing any of your 
Japanese web code, much less running a profiler, I'm willing to bet that your 
code is rarely CPU-bound, and, when it is, it spends a lot more time doing 
things like processing Unicode strings that are almost always UCS-2 (about 110% 
slower on Python 2) than doing this kind of arithmetic (9% faster on Python 2), 
or cutting tomatoes (TypeError on both versions).



Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread Andrew Barnert via Python-Dev
On Jan 25, 2016, at 18:21, INADA Naoki  wrote:
> 
> I'm very interested in it.
> 
> Ruby 2.2 and PHP 7 are faster than Python 2.
> Python 3 is slower than Python 2.

Says who?

That was certainly true in the 3.2 days, but nowadays, most things that differ 
seem to be faster in 3.x. Maybe it's just the kinds of programs I write, but 
speedup in decoding UTF-8 that's usually ASCII (and then processing the decoded 
unicode when it's usually 1/4th the size), faster listcomps, and faster 
datetime seem to matter more than slower logging or slower imports. And that's 
just when running the same code; when you actually use new features, yield from 
is much faster than looping over yield; scandir blows away listdir; asyncio 
blows away asyncore or threading even harder; etc.

Maybe if you do different things, you have a different experience. But if you 
have a specific problem, you'd do a lot better to file specific bugs for that 
problem than to just hope that everything magically gets so much faster that 
your bottleneck no longer matters.

> Performance is a attractive feature.  Python 3 lacks it.

When performance matters, people don't use Python 2, Ruby, or PHP, any more 
than they use Python 3. Or, rather, they use _any_ of those languages for the 
95% of their code that doesn't matter, and C (often through existing libraries 
like NumPy--and try to find a good equivalent of that for Ruby or PHP) for the 
5% that does.


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread Andrew Barnert via Python-Dev
On Jan 25, 2016, at 14:46, Victor Stinner  wrote:
> 
> You can design an AST optimizer to compile some functions to C and
> then register them as specialized code at runtime. I have a side
> project to use Cython and/or pythran to specialize some functions
> using type annotation on parameters.

That last part is exactly what I was thinking of. One way in which cythonizing 
your code isn't 100% compatible is that if you, say, shadow or replace int or 
range, the cython code is now wrong. Which is exactly the kind of thing FAT can 
guard against. Which is very cool. Glad to see you already thought of that 
before me. :)


Re: [Python-Dev] FAT Python (lack of) performance

2016-01-25 Thread Andrew Barnert via Python-Dev
On Jan 25, 2016, at 13:43, Victor Stinner  wrote:
> 
> According to microbenchmarks, the most promising optimizations are
> functions inlining (Python function calls are slow :-/) and specialize
> the code for the type of arguments.

Can you specialize a function with a C API function, or only with bytecode? I'm 
not sure how much benefit you'd get out of specializing list vs. generic 
iterable or int vs. whatever from an AST transform, but substituting raw C 
code, on the other hand...


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Andrew Barnert via Python-Dev
On Wednesday, January 20, 2016 4:10 PM, Brett Cannon  wrote:


>I think Glenn was assuming we had a single, global version # that all dicts 
>shared without having a per-dict version ID. The key thing here is that we 
>have a global counter that tracks the number of mutations for all dictionaries 
>but whose value we store as a per-dictionary value. That ends up making the 
>version ID inherently both a token representing the state of any dict and also 
>the uniqueness of the dict since no two dictionaries will ever have the same 
>version ID.

This idea worries me. I'm not sure why, but I think because of threading. After 
all, it's pretty rare for two threads to both want to work on the same dict, 
but very, very common for two threads to both want to work on _any_ dict. So, 
imagine someone manages to remove the GIL from CPython by using STM: now most 
transactions are bumping that global counter, meaning most transactions fail 
and have to be retried, so you end up with 8 cores each running at 1/64th the 
speed of a single core but burning 100% CPU. Obviously a real-life 
implementation wouldn't be _that_ stupid; you'd special-case the 
version-bumping (maybe unconditionally bump it N times before starting the 
transaction, and then as long as you don't bump more than N times during the 
transaction, you can commit without touching it), but there's still going to be 
a lot of contention.

And that also affects something like PyPy being able to use FAT-Python-style 
AoT optimizations via cpyext. At first glance that sounds like a stupid 
idea--why would you want to run an optimizer through a slow emulator? But the 
optimizer only runs once and transforms the function code, which runs a zillion 
times, so who cares how slow the optimizer is? Of course it may still be true 
that many of the AoT optimizations that FAT makes don't apply very well to 
PyPy, in which case it doesn't matter. But I don't think we can assume that a 
priori.

Is there a way to define this loosely enough so that the implementation _can_ 
be a single global counter, if that turns out to be most efficient, but can 
also be a counter per dictionary and a globally-unique ID per dictionary?


Re: [Python-Dev] Code formatter bot

2016-01-20 Thread Andrew Barnert via Python-Dev
On Jan 20, 2016, at 00:35, Ben Finney  wrote:
> 
> francismb  writes:
> 
>> what's your opinion about a code-formatter bot for cpython.
> 
> What is the proposal? The opinions will surely depend on:

... plus:

* How does the formatter bot deal with "legacy code"? Large parts of CPython 
predate PEPs 7 and 8, and the decision was made long ago not to reformat 
existing code unless that code is being substantially modified for some other 
reason. Similarly, when the PEPs are updated, the usual decision is to not 
reformat old code.

* When code _is_ auto-reformatted, what tools do you have to help git's merge 
logic, Rietveld, human readers looking at diffs or blame/annotate locally or on 
the web, etc. look past the hundreds of trivial changes to highlight the ones 
that matter?

* What's the argument for specifically automating code formatting instead of 
any of the other things a commit-triggered linter can catch just as easily?

But one comment on Ben's comment:

>  * If on the other hand you propose to enforce only those rules which
>are strict enough to be applied automatically (e.g. “don't mix
>spaces and TABs”, “encode source using UTF-8 only”) then that's best
>done by editor plug-ins like EditorConfig[0].

In my experience (although mostly with projects with a lot fewer contributors 
than CPython...), it can be helpful to have both suggested editor plugins that 
do the auto formatting on the dev's computer, and VCS-triggered checkers that 
ensure the formatting was correct. That catches those occasional cases where 
you do a quick "trivial" edit in nano instead of your usual editor and then 
forget you did so and try to check in, without the nasty side-effects you 
mention later (like committing code you haven't seen).

(Of course writing plugins that understand "legacy code" in the exact same way 
as the commit filter can be tricky, but in that case, it's better to know that 
one or the other isn't working as intended--both so a human can decide, and so 
people can see the bug in the plugin or filter--than to automatically make 
changes that weren't wanted.)



Re: [Python-Dev] Update PEP 7 to require curly braces in C

2016-01-19 Thread Andrew Barnert via Python-Dev
> On Jan 19, 2016, at 08:56, Jim J. Jewett  wrote:
> 
> On Mon Jan 18 03:39:42 EST 2016, Andrew Barnert pointed out:
>> 
>> Alternatively, it could say something like "braces must not be omitted;
>> when other C styles would use a braceless one-liner, a one-liner with
>> braces should be used instead; otherwise, they should be formatted as 
>> follows"
> 
> That "otherwise" gets a bit awkward, but I like the idea.  Perhaps
> "braces must not be omitted, and should normally be formatted as
> follows. ... Where other C styles would permit a braceless one-liner,
> the expression and braces may be moved to a single line, as follows: "
> 
>    if (x > 5) { y++; }
> 
> I think that is clearly better, but it may be *too* lightweight for
> flow control.
> 
>    if (!obj)
>    { return -1; }
> 
> does work for me, and I think the \n{} may actually be useful for
> warning that flow control takes a jump.

Your wording is much better than mine. And so is your suggestion. Giving people 
the option of 1 or 3 lines, but not 2, seems a little silly. And, while I 
rarely use or see your 2-line version in C, I use it quite a bit in C++ (and 
related languages like D), so it doesn't look at all weird to me. But I'll 
leave it up to people who only do C (and Python) and/or who are more familiar 
with the CPython code base to judge.


Re: [Python-Dev] Update PEP 7 to require curly braces in C

2016-01-18 Thread Andrew Barnert via Python-Dev
On Jan 17, 2016, at 11:10, Brett Cannon  wrote:
> 
> While doing a review of http://bugs.python.org/review/26129/ I asked to have 
> curly braces put around all `if` statement bodies. Serhiy pointed out that 
> PEP 7 says curly braces are optional: 
> https://www.python.org/dev/peps/pep-0007/#id5. I would like to change that.
> 
> My argument is to require them to prevent bugs like the one Apple made with 
> OpenSSL about two years ago: 
> https://www.imperialviolet.org/2014/02/22/applebug.html. Skipping the curly 
> braces is purely an aesthetic thing while leaving them out can lead to actual 
> bugs.
> 
> Anyone object if I update PEP 7 to remove the optionality of curly braces in 
> PEP 7?

There are two ways you could do that.

The first is to just change "braces may be omitted where C permits, but when 
present, they should be formatted as follows" to something like "braces must 
not be omitted, and should be formatted as follows", changing one-liner tests 
into this:

if (!obj) {
    return -1;
}

Alternatively, it could say something like "braces must not be omitted; when 
other C styles would use a braceless one-liner, a one-liner with braces should 
be used instead; otherwise, they should be formatted as follows", changing the 
same tests into:

if (!obj) { return -1; }

The first one is obviously a much bigger change in the formatting of actual 
code, even if it's a simpler change to the PEP. Is that what was intended?


Re: [Python-Dev] Should inspect.getargspec take any callable?

2016-01-16 Thread Andrew Barnert via Python-Dev
On Jan 16, 2016, at 08:05, Aviv Cohn via Python-Dev  
wrote:
> 
> The `getargspec` function in the `inspect` module enforces the input 
> parameter to be either a method or a function.

`getargspec` already works with classes, callable objects, and some 
builtins.

It's also deprecated, in part because its API can't handle various features 
(like keyword-only arguments). There is an extended version that can handle 
some of those features, but as of 3.5 that one is deprecated as well.

The `signature` function is much easier to use, as well as being more powerful.
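
For example, a quick sketch:

    import inspect

    class Greeter:
        def __call__(self, name, *, punct='!'):
            return 'Hi ' + name + punct

    print(inspect.signature(Greeter()))  # (name, *, punct='!')
    print(inspect.signature(len))        # (obj, /) on recent CPython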

> 
> def getargspec(func):
>     """Get the names and default values of a function's arguments.
> 
>     A tuple of four things is returned: (args, varargs, varkw, defaults).
>     'args' is a list of the argument names (it may contain nested lists).
>     'varargs' and 'varkw' are the names of the * and ** arguments or None.
>     'defaults' is an n-tuple of the default values of the last n
>     arguments.
>     """
> 
>     if ismethod(func):
>         func = func.im_func
>     if not isfunction(func):
>         raise TypeError('{!r} is not a Python function'.format(func))
>     args, varargs, varkw = getargs(func.func_code)
>     return ArgSpec(args, varargs, varkw, func.func_defaults)
> 
> Passing in a callable which is not a function causes a TypeError to be raised.
> 
> I think in this case any callable should be allowed, allowing classes and 
> callable objects as well.
> We can switch on whether `func` is a function, a class or a callable object, 
> and pass into `getargs` the appropriate value.
> 
> What is your opinion?
> Thank you


Re: [Python-Dev] Discussion related to memory leaks requested

2016-01-13 Thread Andrew Barnert via Python-Dev
On Jan 13, 2016, at 14:49, Matthew Paulson  wrote:
> 
> Hi Victor:
> 
> No, I'm using the new heap analysis functions in DS2015. 

Isn't that going to report any memory that Python's higher level allocators 
hold in their freelists as leaked, even though it isn't leaked?

> We think we have found one issue. In the following sequence, dict has no side 
> effects, yet it is used -- unless someone can shed light on why dict is used 
> in this case:

Where do you see an issue here? The dict will have one ref, so the decref at 
the end should return it to the freelist.

Also, it looks like there _is_ a side effect here. When you add a bunch of 
elements to a dict, it grows. When you delete a bunch of elements, it generally 
doesn't shrink. But when you clear the dict, it does shrink. So, copying it to 
a temporary dict, clearing it, updating it from the temporary dict, and then 
releasing the temporary dict should force it to shrink.

So, the overall effect should be that you have a smaller hash table for the 
builtins dict, and a chunk of memory sitting on the freelists ready to be 
reused. If your analyzer is showing the freelists as leaked, this will look 
like a net leak rather than a net recovery, but that's just a problem in the 
analyzer.
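
You can watch the shrink-on-clear behavior from Python (CPython-specific; the 
exact numbers vary by version):

    import sys

    d = {i: None for i in range(10000)}
    for i in range(10000):
        del d[i]
    print(sys.getsizeof(d))  # still large: deletes don't shrink the table

    tmp = dict(d)
    d.clear()                # clear() does release the big table
    d.update(tmp)
    print(sys.getsizeof(d))  # back to a minimal table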

Of course I could be wrong, but I think the first step is to rule out the 
possibility that you're measuring the wrong thing...

> /* Clear the modules dict. */
> PyDict_Clear(modules);
> /* Restore the original builtins dict, to ensure that any
>    user data gets cleared. */
> dict = PyDict_Copy(interp->builtins);
> if (dict == NULL)
>     PyErr_Clear();
> PyDict_Clear(interp->builtins);
> if (PyDict_Update(interp->builtins, interp->builtins_copy))
>     PyErr_Clear();
> Py_XDECREF(dict);
> 
> And removing dict from this sequence seems to have fixed one of the issues, 
> yielding 14k per iteration.

> Simple program: Good idea.  We will try that -- right now it's embedded in a 
> more complex environment, but we have tried to strip it down to a very simple 
> sequence.
> 
> The next item on our list is memory that is not getting freed after running 
> simple string.  It's in the parsertok sequence -- it seems that the syntax 
> tree is not getting cleared -- but this opinion is preliminary.
> 
> Best,
> 
> Matt
> 
>> On 1/13/2016 5:10 PM, Victor Stinner wrote:
>> Hi,
>> 
>> 2016-01-13 20:32 GMT+01:00 Matthew Paulson :
>>> I've spent some time performing memory leak analysis while using Python in 
>>> an embedded configuration.
>> Hum, did you try tracemalloc?
>> 
>> https://docs.python.org/dev/library/tracemalloc.html
>> https://pytracemalloc.readthedocs.org/
>> 
>>> Is there someone in the group that would like to discuss this topic.  There 
>>> seems to be other leaks as well.  I'm new to Python-dev, but willing to 
>>> help or work with someone who is more familiar with these areas than I.
>> Are you able to reproduce the leak with a simple program?
>> 
>> Victor
>> 
>> 
> 
> -- 
> 


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-11 Thread Andrew Barnert via Python-Dev
On Jan 11, 2016, at 15:24, Victor Stinner  wrote:
> 
> 2016-01-12 0:07 GMT+01:00 Gregory P. Smith :
>>> Changes
>>> ===
>>> 
>>> (...)
>> 
>> Please be more explicit about what tests you are performing on the values.
>> setitem's "if the value is different" really should mean "if value is not
>> dict['key']".  similarly for update, there should never be equality checks
>> performed on the values.  just an "is" test of it they are the same object
>> or not.
> 
> Ok, done. By the way, it's also explained below: values are compared
> by their identity, not by their content.
> 
> For best dict efficiency, we can not implement this micro-optimization
> (to avoid a potential branch misprediction in the CPU) and always
> increase the version. But for guards, the micro-optimization can avoid
> a lot of dictionary lookups, especially when a guard watches for a
> large number of keys.

Are you saying that d[key] = d[key] may or may not increment the version, so 
any optimizer can't rely on the fact that it doesn't?

If so, that seems reasonable. (The worst case in incrementing the version 
unnecessarily is that you miss an optimization that would have been safe, 
right?).


Re: [Python-Dev] bitwise operations for bytes and bytearray

2016-01-09 Thread Andrew Barnert via Python-Dev
On Jan 9, 2016, at 16:17, Blake Griffith  wrote:
> 
> A little update, I got ^, &, and | working for bytearrays. You can view the 
> diff here:
> https://github.com/python/cpython/compare/master...cowlicks:bitwise-bytes?expand=1

If you upload the diff to the issue on the tracker, the Rietveld code review 
app should be able to pick it up automatically, allowing people to comment on 
it inline, in a much nicer format than a mailing list thread. It's especially 
nice if you're adding things in stages--people who have been following along 
can just look at the changes between patch 3 and 4, while new people can look 
at all the changes in one go, etc.

> How does it look? 
> Joe, is this how I should allocate the arrays? Am I freeing them properly?
> Am I checking the input enough?
> 
> After some feedback, I'll probably add bitshifting and negation. Then work on 
> bytes objects.
> 
> Does this warrant a pep?

Personally, I'd just make the case for the feature on the tracker issue. If one 
of the core devs thinks it needs a PEP, or further discussion on this list or 
-ideas, they'll say so there.

At present, it seems like there's not much support for the idea, but I think 
that's at least partly because people want to see realistic use cases (that 
aren't served better by the existing bitarray/bitstring/etc. modules on PyPI, 
or using a NumPy array, or just using ints, etc.).
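For concreteness, here's the semantics the patch would hang off `^`, written 
as the pure-Python workaround you'd use today (a sketch of the behavior, not 
the C implementation):

def xor_bytes(a, b):
    # What the patch would let you spell as just `a ^ b`.
    if len(a) != len(b):
        raise ValueError("arguments must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

assert xor_bytes(b"a", b"b") == b"\x03"
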

>> On Fri, Jan 8, 2016 at 2:08 AM, Cameron Simpson  wrote:
>>> On 07Jan2016 16:12, Python-Dev  wrote:
>>> On Jan 7, 2016, at 15:57, Martin Panter  wrote:
> On 7 January 2016 at 22:26, Blake Griffith  
> wrote:
> I'm interested in adding the functionality to do something like:
 b'a' ^ b'b'
> b'\x03'
> Instead of the good ol' TypeError.
> 
> I think both bytes and bytearray should support all the bitwise 
> operations.
 
 There is a bug open about adding this kind of functionality:
 .
>>> 
>>> And it's in the needs patch stage, which makes it perfect for the OP: in 
>>> addition to learning how to hack on builtin types, he can also learn the 
>>> other parts of the dev process. (Even if the bug is eventually rejected, as 
>>> seems likely given that it sat around for three years with no compelling 
>>> use case  and then Guido added a "very skeptical" comment.)
>> 
>> The use case which springs immediately to my mind is cryptography. To 
>> encrypt a stream symmetrically you can go:
>> 
>>  cleartext-bytes ^ cryptographicly-random-bytes-from-cipher
>> 
>> so with this one could write:
>> 
>>  def crypted(byteses, crypto_source):
>>''' Accept an iterable source of bytes objects and a preprimed source of  
>>   crypto bytes, yield encrypted versions of the bytes objects.
>>'''
>>for bs in byteses:
>>  cbs = crypto_source.next_bytes(len(bs))
>>  yield bs ^ cbs
>> 
>> Cheers,
>> Cameron Simpson 


Re: [Python-Dev] bitwise operations for bytes and bytearray

2016-01-07 Thread Andrew Barnert via Python-Dev
On Jan 7, 2016, at 15:57, Martin Panter  wrote:
> 
>> On 7 January 2016 at 22:26, Blake Griffith  
>> wrote:
>> I'm interested in adding the functionality to do something like:
>> 
> b'a' ^ b'b'
>> b'\x03'
>> 
>> 
>> Instead of the good ol' TypeError.
>> 
>> I think both bytes and bytearray should support all the bitwise operations.
> 
> There is a bug open about adding this kind of functionality:
> .

And it's in the needs patch stage, which makes it perfect for the OP: in 
addition to learning how to hack on builtin types, he can also learn the other 
parts of the dev process. (Even if the bug is eventually rejected, as seems 
likely given that it sat around for three years with no compelling use case  
and then Guido added a "very skeptical" comment.)


Re: [Python-Dev] PEP 257 and __init__

2015-12-29 Thread Andrew Barnert via Python-Dev
On Dec 29, 2015, at 13:03, Facundo Batista  wrote:
> 
>> On Tue, Dec 29, 2015 at 4:38 PM, Andrew Barnert  wrote:

>> I usually just don't bother. You can violate PEP 257 when it makes sense, 
>> just like PEP 8. They're just guidelines, not iron-clad rules.
> 
> Yeap, but pep257 (the tool [0]) complains for __init__, and wanted to
> know how serious was it.

Of course. It's telling you that you're not following the standard, which is 
correct. It's also expected in this case, and if you think you have a good 
reason for breaking from the standard, that's perfectly fine. You probably want 
to configure the tool to meet your own standards. (I've worked on multiple 
projects that used custom pep8 configurations. I haven't used pep257 as much, 
but I believe I've seen configurations for the slightly different conventions 
of scientific/numerical programming and Django programming, so presumably 
coming up with your own configuration shouldn't be too hard--don't require 
docstrings on __init__, or on all special methods, or only when there's no 
__new__, or whatever.)
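
For example, with recent versions of the tool (since renamed pydocstyle), a 
config section along these lines should do it--treat the exact error code as 
an assumption and check your version's docs before copying:

# setup.cfg -- D107 is "Missing docstring in __init__" in modern pydocstyle
[pydocstyle]
add-ignore = D107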


Re: [Python-Dev] PEP 257 and __init__

2015-12-29 Thread Andrew Barnert via Python-Dev
On Dec 29, 2015, at 10:27, Facundo Batista  wrote:

> I was reading PEP 257 and it says that all public methods from a class
> (including __init__) should have a docstring.
> 
> Why __init__?
> 
> Its behaviour is well defined (inits the instance), and the
> initialization parameters should be described in the class' docstring
> itself, right?

Isn't the same thing true for every special method? There are lots of classes 
where __add__ just says "a.__add__(b) = a + b" or (better following the PEP) 
"Return self + value." But, in the rare case where the semantics of "a + b" are 
a little tricky (think of "a / b" for pathlib.Path), where else could you put 
it but __add__?
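
For instance, a sketch (not pathlib's actual code) of the sort of case I mean:

class Path:
    """Minimal sketch of a filesystem path."""

    def __init__(self, *parts):
        self._parts = parts

    def __truediv__(self, other):
        """Return a new Path with `other` appended as a path component.

        This is path joining, not numeric division -- exactly the kind
        of twist that's worth documenting on the method itself.
        """
        return Path(*self._parts, other)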

Similarly, for most classes, there's only one of __init__ or __new__, and the 
construction/initialization semantics are simple enough to describe in one line 
of the class docstring--but when things are more complicated and need to be 
documented, where else would you put it?

Meanwhile, the useless one-liner docstrings for these methods aren't usually a 
problem except in trivial classes--and in trivial classes, I usually just don't 
bother. You can violate PEP 257 when it makes sense, just like PEP 8. They're 
just guidelines, not iron-clad rules.

Unless you're working on a project that insists that we must follow those 
guidelines, usually for some good reason like having lots of devs who are more 
experienced in other languages and whose instinctive "taste" isn't sufficiently 
Pythonic. And for that use case, keeping the rules as simple as possible is 
probably helpful. Better to have one wasted line in every file than to have an 
extra rule that all those JS developers have to remember when they're working 
in Python.


Re: [Python-Dev] Change the repr for datetime.timedelta (was Re: Asynchronous context manager in a typical network server)

2015-12-21 Thread Andrew Barnert via Python-Dev
On Dec 21, 2015, at 14:07, Chris Barker  wrote:
> 
> and there are a LOT of next-to worthless docstrings in the stdlib -- it would 
> be nice to clean them all up.
> 
> Is there any reason not to, other than someone having to do the work?

Is this just a matter of _datetimemodule.c (and various other things in the 
stdlib) not being (completely) argclinicified? Or is there something hairy 
about this type (and various other things in the stdlib) that makes them still 
useless even with argclinic?


Re: [Python-Dev] Asynchronous context manager in a typical network server

2015-12-18 Thread Andrew Barnert via Python-Dev
On Friday, December 18, 2015 1:09 PM, Guido van Rossum  wrote:


>I guess we could make the default arg to sleep() 1e9. Or make it None and 
>special-case it. I don't feel strongly about this -- I'm not sure how baffling 
>it would be to accidentally leave out the delay and find your code sleeps 
>forever rather than raising an error (since if you don't expect the infinite 
>default you may not expect this kind of behavior).

Yeah, that is a potential problem.

The traditional C solution is to just allow passing -1 to mean "forever",* 
ideally with a constant so you can just say "sleep(FOREVER)". Which, in Python 
terms, would presumably mean "asyncio.sleep(asyncio.forever)", and it could be 
a unique object or an enum value or something instead of actually being -1.

* Or at least "until this rolls over 31/32/63/64 bits", which is where you get 
those 49-day bugs from... but that wouldn't be an issue in Python

> But I do feel it's not important enough to add a new function or method.

Definitely agreed.
>However, I don't think "forever" and "until cancelled" are really the same 
>thing. "Forever" can only be interrupted by loop.stop(); "until cancelled" 
>requires indicating how to cancel it, and there the OP's approach is about the 
>best you can do. (Or you could use the Event class, but that's really just a 
>wrapper on top of a Future made to look more like threading.Event in its API.)


OK, I thought the OP's code looked pretty clear as written: he wants to wait 
until cancelled, so he waits on something that pretty clearly won't ever finish 
until he's cancelled. If that (or an Event or whatever) is the best way to 
spell this, then I can't really think of any good uses for sleep(forever).
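
For anyone reading this in the archive: with today's API (this thread predates 
asyncio.run and create_task), the OP's pattern comes out roughly like this 
sketch:

import asyncio

async def serve_until_cancelled():
    # A Future nobody ever completes: awaiting it only ends when this
    # task is cancelled (e.g. by a signal handler).
    await asyncio.get_running_loop().create_future()

async def main():
    task = asyncio.create_task(serve_until_cancelled())
    await asyncio.sleep(0.1)   # stand-in for "run until shutdown is requested"
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("clean shutdown")

asyncio.run(main())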


Re: [Python-Dev] Asynchronous context manager in a typical network server

2015-12-18 Thread Andrew Barnert via Python-Dev
On Dec 18, 2015, at 10:36, Guido van Rossum  wrote:
> 
>> On Fri, Dec 18, 2015 at 10:25 AM, Szieberth Ádám  wrote:
>> Thanks for your reply Guido!
>> 
>> > - In theory, instead of waiting for a Future that is cancelled by a
>> > handler, you should be able to use asyncio.sleep() with a very large number
>> > (e.g. a million seconds).
>> 
>> I was thinking on this too but it seemed less explicit to me than awaiting a
>> pure Future with a short comment. Moreover, even millions of seconds can 
>> pass.
> 
> 11 years.

It's 11 days. Which is pretty reasonable server uptime. And probably just 
outside the longest test you're ever going to run. I don't trust myself to pick 
"a big number" when the numbers get this big. But I still sometimes sneak one 
past myself somehow. Hence my suggestion for a way to actually say "forever".
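
For the record, the arithmetic:

>>> 1000000 / (24 * 3600)   # one million seconds, in days
11.574074074074074

(Guido's "11 years" corresponds to roughly 347 million seconds, not one 
million.)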



Re: [Python-Dev] Asynchronous context manager in a typical network server

2015-12-18 Thread Andrew Barnert via Python-Dev
On Dec 18, 2015, at 10:25, Szieberth Ádám  wrote:
> 
>> - In theory, instead of waiting for a Future that is cancelled by a
>> handler, you should be able to use asyncio.sleep() with a very large number
>> (e.g. a million seconds).
> 
> I was thinking on this too but it seemed less explicit to me than awaiting a 
> pure Future with a short comment. Moreover, even millions of seconds can pass.

Yes, and these are really fun to debug. When a customer comes to you with "it 
was running fine for a few months and then suddenly it started going crazy, but 
I can't reproduce it", unless you happen to remember that you decided 10 
million seconds was "forever" and ask whether "a few months" specifically means 
a few days short of 4 months... (At least with 24 and 49 days I know to look 
for which library used a C integer for milliseconds.)

Really, I don't see anything wrong with the way the OP wrote it. Is that just 
because I have bad C habits (/* Useless select because there's no actual sleep 
function that allows SIGUSR to wake us without allowing all signals to wake us 
that works on both Solaris and IRIX */) and it really does look misleading to 
people who aren't warped like that?

If so, would it be worth having an actual way to say "sleep forever (until 
canceled)"? Even if, under the covers, this only sleeps for 5 years or so, 
a Y52K problem that can be solved by just pushing a new patch release for 
Python instead of for every separate server written in Python is probably a bit 
nicer. :)
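
Something like this sketch, say (sleep_forever is a hypothetical name, not an 
existing asyncio API; an Event that nobody ever sets waits indefinitely):

import asyncio

async def sleep_forever():
    # Suspend until cancelled, with no magic big number to pick wrong.
    await asyncio.Event().wait()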


Re: [Python-Dev] Idea: Dictionary references

2015-12-18 Thread Andrew Barnert via Python-Dev
> On Dec 18, 2015, at 04:56, Steven D'Aprano  wrote:
> 
>>> On Thu, Dec 17, 2015 at 09:30:24AM -0800, Andrew Barnert via Python-Dev 
>>> wrote:
>>> On Dec 17, 2015, at 07:38, Franklin? Lee  
>>> wrote:
>>> 
>>> The nested dictionaries are only for nested scopes (and inner
>>> functions don't create nested scopes). Nested scopes will already
>>> require multiple lookups in parents.
>> 
>> I think I understand what you're getting at here, but it's a really 
>> confusing use of terminology. In Python, and in programming in 
>> general, nested scopes refer to exactly inner functions (and classes) 
>> being lexically nested and doing lookup through outer scopes. The fact 
>> that this is optimized at compile time to FAST vs. CELL vs. 
>> GLOBAL/NAME, cells are optimized at function-creation time, and only 
>> global and name have to be resolved at the last second doesn't mean 
>> that there's no scoping, or some other form of scoping besides 
>> lexical. The actual semantics are LEGB, even if L vs. E vs. GB and E 
>> vs. further-out E can be optimized.
> 
> In Python 2, the LOAD_NAME byte-code can return a local, even though it 
> normally doesn't:
> 
> py> x = "global"
> py> def spam():
> ... exec "x = 'local'"
> ... print x
> ...
> py> spam()
> local
> py> x == 'global'
> True
> 
> 
> If we look at the byte-code, we see that the lookup is *not* optimized 
> to inspect locals only (LOAD_FAST), but uses the regular LOAD_NAME that 
> normally gets used for globals and builtins:
> 
> py> import dis
> py> dis.dis(spam)
>  2   0 LOAD_CONST   1 ("x = 'local'")
>  3 LOAD_CONST   0 (None)
>  6 DUP_TOP
>  7 EXEC_STMT
> 
>  3   8 LOAD_NAME0 (x)
> 11 PRINT_ITEM
> 12 PRINT_NEWLINE
> 13 LOAD_CONST   0 (None)
> 16 RETURN_VALUE
> 
> 
> 
>> What you're talking about here is global lookups falling back to 
>> builtin lookups. There's no more general notion of nesting or scoping 
>> involved, so why use those words?
> 
> I'm not quite sure about this. In principle, every name lookup looks in 
> four scopes, LEGB as you describe above:
> 
> - locals
> - non-locals, a.k.a. enclosing or lexical scope(s)
> - globals (i.e. the module)
> - builtins
> 
> 
> although Python can (usually?) optimise away some of those lookups.

I think it kind of _has_ to optimize away, or at least tweak, some of those 
things, rather than just acting as if globals and builtins were just two more 
enclosing scopes. For example, global to builtins has to go through 
globals()['__builtins__'], or act as if it does, or code that relies on, say, 
the documented behavior of exec can be broken. And you have to be able to 
modify the global scope after compile time and have that modification be 
effective, which means you'd have to allow the same things on locals and 
closures if they were to act the same.

> The 
> relationship of locals to enclosing scopes, and to globals in turn, 
> involve actual nesting of indented blocks in Python, but that's not 
> necessarily the case. One might imagine a hypothetical capability for 
> manipulating scopes directly, e.g.:
> 
> def spam(): ...
> def ham(): ...
> set_enclosing(ham, spam)
> # like:
> # def spam():
> # def ham(): ...

But that doesn't work; a closure has to link to a particular invocation of its 
outer function, not just to the function. Consider a trivial example:

def spam(): x = time()
def ham(): return x
set_enclosing(ham, spam)
ham()

There's no actual x value in scope. So you need something like this if you want 
to actually be able to call it:

def spam(helper):
    x = time()
    helper = bind_closure(helper, sys._getframe())
    return helper()

def ham(): return x

set_enclosing(ham, spam)
spam(ham)

Of course you could make that getframe implicit; the point is there has to be a 
frame from an invocation of spam, not just the function itself, to make lexical 
scoping (errr... dynamically-generated fake-lexical scoping?) useful.

> The adventurous or fool-hardy can probably do something like that now 
> with byte-code hacking :-)

Yeah; I actually played with something like this a few years ago. I did it 
directly in terms of creating cell and free vars, not circumventing the 
existing LEGB system, which means you have to modify not just ham, but spam, in 
that set_enclosing. (In fact, you also have to modify all functions lexically 
or faux-l

Re: [Python-Dev] Idea: Dictionary references

2015-12-17 Thread Andrew Barnert via Python-Dev
On Dec 17, 2015, at 15:41, Franklin? Lee  wrote:
> 
> I already know that we can't use recursion, because it bypasses MRO.
> I'm also not yet sure whether it makes sense to use refs for classes
> in the first place.
> 
> As I understood it, an attribute will resolve in this order:
> - __getattribute__ up the MRO. (raises AttributeError)
> - __dict__ up the MRO. (raises KeyError)
> - __getattr__ up the MRO. (raises AttributeError)
> 
> 
> My new understanding:
> - __getattribute__. (raises AttributeError)
>- (default implementation:) __dict__.__getitem__. (raises KeyError)
> - __getattr__ up the MRO. (raises AttributeError)

No, still completely wrong.

If __getattribute__ raises an AttributeError (or isn't found, but that only 
happens in special cases like somehow calling a method on a type that hasn't 
been constructed), that's the end of the line; there's no fallback. Everything 
else happens inside it (IIRC: searching MRO dicts for data descriptors, 
searching the instance dict, searching MRO dicts for non-data descriptors or 
non-descriptors, special-method lookup and call of __getattr__, raising 
AttributeError... and then doing the appropriate descriptor call at the end if 
needed).

I was going to say that the only custom __getattribute__ you'll find in 
builtins or stdlib is on type, which does the exact same thing except when it 
calls a descriptor it does __get__(None, cls) instead of __get__(obj, 
type(obj)), and if you find any third-party __getattribute__ you should just 
assume it's going to do something crazy and don't bother trying to help it. But 
then I remembered that super must have a custom __getattribute__, so... you'd 
probably need to search the code for others.

> If this is the case, then (the default) __getattribute__ will be
> making the repeated lookups, and might be the one requesting the
> refcells (for the ones it wants).

Yes, the default and type __getattribute__ are what you'd want to optimize, if 
anything. And maybe special-method lookup.

> Descriptors seem to be implemented as:
>Store a Descriptor object as an attribute. When a Descriptor is
> accessed, if it is being accessed from its owner, then unbox it and
> use its methods. Otherwise, it's a normal attribute.

Depending on what you mean by "owner", I think you have that backward. If the 
instance itself stores a descriptor, it's just used as itself; if the 
instance's _type_ (or a supertype) stores one, it's called to get the instance 
attribute.

> Then Descriptors are in the dict, so MIGHT benefit from refcells. The
> memory cost might be higher, though.

Same memory cost. They're just objects whose type's dicts happen to have a 
__get__ method (just like iterables are just objects whose type's dicts happen 
to have an __iter__ method). The point is that you can't cache the result of 
the descriptor call, you can cache the descriptor itself but it will rarely 
help, and the builtin method cache probably already takes care of 99% of the 
cases where it would help, so I don't see what you're going to get here.

>> On Thu, Dec 17, 2015 at 5:17 PM, Andrew Barnert  wrote:
>>> On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev 
>>>  wrote:
>>> 
>>> On Thursday, December 17, 2015 11:19 AM, Franklin? Lee 
>>>  wrote:
>>> 
>>> 
>>>> ...
>>>> as soon as I figure out how descriptors actually work...
>>> 
>>> 
>>> I think you need to learn what LOAD_ATTR and the machinery around it 
>>> actually does before I can explain why trying to optimize it like 
>>> globals-vs.-builtins doesn't make sense. Maybe someone who's better at 
>>> explaining than me can come up with something clearer than the existing 
>>> documentation, but I can't.
>> 
>> I take that back. First, it was harsher than I intended. Second, I think I 
>> can explain things.
> 
> I appreciate it! Tracking function definitions in the source can make
> one want to do something else.

The documentation is pretty good for this stuff (and getting better every 
year). You mainly want the data model chapter of the reference and the 
descriptor howto guide; the dis and inspect docs in the library can also be 
helpful. Together they'll answer most of what you need.

If they don't, maybe I will try to write up an explanation as a blog post, but 
I don't think it needs to get sent to the list (except for the benefit of core 
devs calling me out of I screw up, but they have better things to do with their 
time).

>> First, for non-attribute lookups:
>> 
>> (Non-shared) locals just load and save from an array.
>> 
>> Free variables and shared locals 

Re: [Python-Dev] Idea: Dictionary references

2015-12-17 Thread Andrew Barnert via Python-Dev
On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev 
 wrote:
> 
> On Thursday, December 17, 2015 11:19 AM, Franklin? Lee 
>  wrote:
> 
> 
>> ...
>> as soon as I figure out how descriptors actually work...
> 
> 
> I think you need to learn what LOAD_ATTR and the machinery around it actually 
> does before I can explain why trying to optimize it like globals-vs.-builtins 
> doesn't make sense. Maybe someone who's better at explaining than me can come 
> up with something clearer than the existing documentation, but I can't.

I take that back. First, it was harsher than I intended. Second, I think I can 
explain things.

First, for non-attribute lookups:

(Non-shared) locals just load and save from an array.

Free variables and shared locals load and save by going through an extra 
dereference on a cell object in an array.

Globals do a single dict lookup.

Builtins do two dict lookups.

So, the only thing you can optimize there is builtins. But maybe that's worth 
it.
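
To see the builtins case concretely (a sketch; the exact bytecode varies by 
CPython version):

import dis

def f():
    return len("spam")

dis.dis(f)
# The LOAD_GLOBAL for `len` is the two-dict-lookup case: a miss in the
# module's globals, then a hit in builtins. (Recent CPythons layer caches
# on top, but the semantics are the two-step lookup described above.)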

Next, for attribute lookups (not counting special methods):

Everything calls __getattribute__. Assuming that's not overridden and uses the 
object implementation:

Instance attributes do one dict lookup.

Class attributes (including normal methods, @property, etc.) do two or more 
dict lookups--first the instance, then the class, then each class on the 
class's MRO. Then, if the result has a __get__ method, it's called with the 
instance and class to get the actual value. This is how bound methods get 
created, property lookup functions get called, etc. The result of the 
descriptor call can't get cached (that would mean, for example, that every time 
you access the same @property on an instance, you'd get the same value).
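
A tiny example of why caching that result would be wrong:

import itertools

class Sensor:
    _ticks = itertools.count()

    @property
    def reading(self):
        # Recomputed on every access. Caching the result of the
        # descriptor call would freeze this value forever.
        return next(self._ticks)

s = Sensor()
assert s.reading != s.reading   # two accesses, two different values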

Dynamic attributes from a __getattr__ do all that plus whatever __getattr__ 
does.

If __getattribute__ is overloaded, it's entirely up to that implementation to 
do whatever it wants.

Things are similar for set and del: they call __setattr__/__delattr__, and the 
default versions of those look in the instance dict first, then look for a 
descriptor the same as with get except that they call a different method on the 
descriptor (and if it's not a descriptor, instead of using it, they ignore it 
and go back to the instance dict).

So, your mechanism can't significantly speed up method lookups, properties, or 
most other things. It could speed up lookups for class attributes that aren't 
descriptors, but only at the cost of increasing the size of every instance--and 
how often do those matter anyway?

A different mechanism that cached references to descriptors instead of to the 
resulting attributes could speed up method lookups, etc., but only by a very 
small amount, and with the same space cost.

A mechanism that didn't try to get involved with the instance dict, and just 
flattened out the MRO search once that failed (and was out of the way before 
the descriptor call or __getattr__ even entered the picture) might speed 
methods up in deeply nested hierarchies, and with only a per-class rather than 
a per-instance space cost. But how often do you have deeply-nested hierarchies? 
And the speedup still isn't going to be that big: You're basically turning 5 
dict lookups plus 2 method calls into 2 dict lookups plus 2 method calls. And 
it would still be much harder to guard than the globals dict: if any superclass 
changes its __bases__ or adds or removes a __getattribute__ or various other 
things, all of your references have to get re-computed. That's rare enough that 
the speed may not matter, but the code complexity probably does.

In short: if you can't cache the bound methods (and as far as I can tell, in 
general you can't--even though 99% of the time it would work), I don't think 
there's any other significant win here. 

So, if the globals->builtins optimization is worth doing, don't tie it to 
another optimization that's much more complicated and less useful like this, or 
we'll never get your simple and useful idea. 



Re: [Python-Dev] Idea: Dictionary references

2015-12-17 Thread Andrew Barnert via Python-Dev
On Thursday, December 17, 2015 11:19 AM, Franklin? Lee 
 wrote:


> ...
> as soon as I figure out how descriptors actually work...


I think you need to learn what LOAD_ATTR and the machinery around it actually 
does before I can explain why trying to optimize it like globals-vs.-builtins 
doesn't make sense. Maybe someone who's better at explaining than me can come 
up with something clearer than the existing documentation, but I can't.


Re: [Python-Dev] Idea: Dictionary references

2015-12-17 Thread Andrew Barnert via Python-Dev
On Dec 17, 2015, at 07:38, Franklin? Lee  wrote:
> 
> The nested dictionaries are only for nested scopes (and inner
> functions don't create nested scopes). Nested scopes will already
> require multiple lookups in parents.

I think I understand what you're getting at here, but it's a really confusing 
use of terminology. In Python, and in programming in general, nested scopes 
refer to exactly inner functions (and classes) being lexically nested and doing 
lookup through outer scopes. The fact that this is optimized at compile time to 
FAST vs. CELL vs. GLOBAL/NAME, cells are optimized at function-creation time, 
and only global and name have to be resolved at the last second doesn't mean 
that there's no scoping, or some other form of scoping besides lexical. The 
actual semantics are LEGB, even if L vs. E vs. GB and E vs. further-out E can 
be optimized.

What you're talking about here is global lookups falling back to builtin 
lookups. There's no more general notion of nesting or scoping involved, so why 
use those words?

Also, reading your earlier post, it sounds like you're trying to treat 
attribute lookup as a special case of global lookup, only with a chain of 
superclasses beyond the class instead of just a single builtins. But they're 
totally different. Class lookup doesn't just look in a series of dicts, it 
calls __getattribute__ which usually calls __getattr__ which may or may not 
look in the __dict__s (which may not even exist) to find a descriptor and then 
calls its __get__ method to get the value. You'd have to somehow handle the 
case where the search only went through object.__getattribute__ and __getattr__ 
and found a result by looking in a dict, to make a RefCell to that dict which 
is marked in some way that says "I'm not a value, I'm a descriptor you have to 
call each time", and then apply some guards that will detect whether that class 
or any intervening class dict touched that key, whether the MRO changed, 
whether that class or any intervening class added or changed implementations for
  __getattribute__ or __getattr__, and probably more things I haven't thought 
of. What do those guards look like? (Also, you need a different set of rules to 
cache, and guard for, special method lookup--you could just ignore that, but I 
think those are the lookups that would benefit most from optimization.)

So, trying to generalize global vs. builtin to a general notion of "nested 
scope" that isn't necessary for builtins and doesn't work for anything else 
seems like overcomplicating things for no benefit.


> I think this is strictly an
> improvement, except perhaps in memory. Guards would also have an issue
> with nested scopes. You have a note on your website about it:
> (https://faster-cpython.readthedocs.org/fat_python.html#call-pure-builtins)


Re: [Python-Dev] async/await behavior on multiple calls

2015-12-16 Thread Andrew Barnert via Python-Dev
> On Dec 16, 2015, at 03:25, Paul Sokolovsky  wrote:
> 
> Hello,
> 
> On Tue, 15 Dec 2015 17:29:26 -0800
> Roy Williams  wrote:
> 
>> @Kevin correct, that's the point I'd like to discuss.  Most other
>> mainstream languages that implements async/await expose the
>> programming model with Tasks/Futures/Promises as opposed to
>> coroutines  PEP 492 states 'Objects with __await__ method are called
>> Future-like objects in the rest of this PEP.' but their behavior
>> differs from that of Futures in this core way.  Given that most other
>> languages have standardized around async returning a Future as
>> opposed to a coroutine I think it's worth exploring why Python
>> differs.
> 
> Sorry, but what makes you think that it's worth exploring why Python
> Python differs, and not why other languages differ?

They're really the same question.

Python differs from C# in that it builds async on top of language-level 
coroutines instead of hiding them under the hood, it only requires a simple 
event loop (which can be trivially built on a select-like function and a loop) 
rather than a powerful OS/VM-level task scheduler, it's designed to allow 
pluggable schedulers (maybe even multiple schedulers in one app), it doesn't 
have a static type system to assist it, ... Turn it around and ask how C# 
differs from Python and you get the same differences. And there's no value 
judgment either way.

So, do any of those explain why some Python awaitables aren't safely 
re-awaitable? Yes: the fact that Python uses language-level coroutines instead 
of hiding them under the covers means that it makes sense to be able to 
directly await coroutines (and to make async functions return those coroutines 
when called), which raises a question that doesn't exist in C#.

What happens when you await an already-consumed awaitables? That question 
doesn't arise in C# because it doesn't have consumable awaitables. Python 
_could_ just punt on that by not allowing coroutines to be awaitable, or 
auto-wrapping them, but that would be giving up a major positive benefit over 
C#. So, that means Python instead has to decide what happens.

In general, the semantics of awaiting an awaitable are that you get its value 
or an exception. Can you preserve those semantics even with raw coroutines as 
awaitables? Sure; as two people have pointed out in this thread, just make 
awaiting a consumed coroutine raise. Problem solved. But if nobody had asked 
about the differences between Python and C#, it would have been a lot harder to 
solve (or even see) the question.
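
(That raise is, in fact, what CPython does; a quick demonstration, using 
today's asyncio.run for brevity:)

import asyncio

async def answer():
    return 42

async def main():
    coro = answer()
    print(await coro)   # 42: the first await consumes the coroutine
    try:
        await coro      # the second await finds it already spent
    except RuntimeError as e:
        print(e)        # cannot reuse already awaited coroutine

asyncio.run(main())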

> Also, what "most other languages" do you mean?

Well, what he said was "Most other mainstream languages that implements 
async/await". But you're right; clearly what he meant was just C#, because 
that's the only other mainstream language that implements async/await today. 
Others (JS, Scala) are implementing it or considering doing so, but, just like 
Python, they're borrowing it from C# anyway. (Unless you want to call F# async 
blocks and let! binding the same feature--but if so, C# borrowed from F# and 
everyone else borrowed from C#, so it's still the same.)

> Lua was a pioneer of
> coroutine usage in scripting languages, with research behind that.
> It doesn't have any "futures" or "promises" as part of the language.
> It has only coroutines. For niche cases when "futures" or "promises"
> needed, they can be implemented on top of coroutines.
> 
> And that's actually the problem with Python's asyncio - it tries to
> marry all the orthogonal concurrency concepts, unfortunately good
> deal o'mess ensues.

The fact that futures can be built on top of coroutines, or on top of promises 
and callbacks, means they're a way to tie together pieces of asynchronous code 
written in different styles. And the idea of a simple supertype of both futures 
and coroutines that's sufficient for a large set of problems, means you rarely 
need wrappers to transform one into the other; just use whichever one you have 
as an awaitable and it works.

So, you can write 80% of your code in terms of awaitables, but if the last 20% 
needs to get at the native coroutines, or to integrate with legacy code using 
callbacks, it's easy to do so. In C#, you instead have to simulate those 
coroutines with promises even when you're not integrating with legacy code; in 
a language without futures you'd have to wrap each call into and out of legacy 
code manually.

If you were designing a new language, you could probably get away with 
something a lot simpler. (If the only thing you could ever need a future for is 
to cache an awaitable value, it's a one-liner.) But for Python (and JS, Scala, 
C#, etc.) that isn't an option.

> It doesn't help on "PR" side too, because coroutine
> lovers blame it for not being based entirely on language's native
> coroutines, strangers from other languages want to twist it to be based
> entirely on foreign concepts like futures, Twisted haters hate that it
> has too much complication tak

Re: [Python-Dev] async/await behavior on multiple calls

2015-12-15 Thread Andrew Barnert via Python-Dev
On Dec 15, 2015, at 17:29, Roy Williams  wrote:
> 
> My proposal would be to automatically wrap the return value from an `async` 
> function or any object implementing `__await__` in a future with 
> `asyncio.ensure_future()`.  This would allow async/await code to behave in a 
> similar manner to other languages implementing async/await and would remain 
> compatible with existing code using asyncio.

Two questions:

Is it possible (and at all reasonable) to write code that actually depends on 
getting raw coroutines from async?

If not, is there any significant performance impact for code that works with 
raw coroutines and doesn't need real futures to get them wrapped in futures 
anyway?




Re: [Python-Dev] A function for Find-Replace in lists

2015-12-09 Thread Andrew Barnert via Python-Dev
On Dec 9, 2015, at 03:43, טל ח  wrote:
> 
> Hi,
> 
> I think it could be helpful for everyone if the function proposed by user 
> "SomethingSomething" can be added as built-in
> in Python

Why? When he was asked what use it might have, he didn't have an answer.

Also, notice that the answer he provided doesn't actually do what he asked for; 
as he himself points out, it's different in at least two ways from his stated 
requirements. So, which one of the two do you want? And why is that one, rather 
than the other, useful?

Also, why would you call this list_replace? That sounds like a function that 
would replace elements with elements, not make a copy with elements replaced by 
new lists flattened into place.

Also, why would you only want this lists, rather than for any iterable? And 
what can it do that this more general and completely trivial function can't:

def flattening_subst(iterable, value, sequence):
    for x in iterable:
        if x == value: yield from sequence
        else: yield x
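
For concreteness:

>>> list(flattening_subst([1, 0, 2, 0], 0, ['a', 'b']))
[1, 'a', 'b', 2, 'a', 'b']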

If you know of another language whose standard library has an equivalent, that 
might narrow down exactly what the requirements are, point at an implementation 
that actually meets those requirements, and probably provide examples that hint 
at the point of having this function in the first place.


> See both question by "SomethingSomething" and answer to himself with 
> implementation..
> 
> http://stackoverflow.com/questions/34174643/python-find-replace-on-lists
> 
> 
> Thanks


Re: [Python-Dev] Python Language Reference has no mention of list comprehensions

2015-12-04 Thread Andrew Barnert via Python-Dev
On Dec 4, 2015, at 00:38, Nick Coghlan  wrote:
> 
> On 4 December 2015 at 12:48, Andrew Barnert via Python-Dev
>  wrote:
>> On Dec 3, 2015, at 17:25, Steven D'Aprano  wrote:
>>>> On Thu, Dec 03, 2015 at 09:25:53AM -0800, Andrew Barnert via Python-Dev 
>>>> wrote:
>>>> I've seen people saying that before, but I don't know where they get
>>>> that. It's certainly not the way, say, C++ or JavaScript use the term.
> 
> I'm one of the folks that use it that way, but I learned that
> terminology *from* the Python language reference.

If that's the usual case, then isn't it almost certainly more true for 
"display" than for "literal"? I doubt most Python users came in with a 
pre-existing notion of "display" from another language, or from programming in 
general--or, if they did, it's probably one of the senses that's irrelevant 
enough to not confuse anyone (like a repr, or a string formatting template). So 
if you want to redefine one of our terms to allow a new distinction, why not 
that one?

More importantly, as I said in my other message: do we actually need to be able 
to make this distinction? The problem this thread set out to solve is that 
"comprehension" doesn't have a docs section because it's just a subset of 
displays, so you can't search for it. Making it a subset of dynamic literals, 
which is a subset of literals, seems like it gets us farther from a solution. 
Right now, we could easily change the section title to "list displays 
(including comprehensions)" and we're done.

>>> I wouldn't take either of those two languages as examples of best
>>> practices in language design :-)
>> 
>> No, but they seem to be the languages (along with C and Java) that people 
>> usually appeal to.
>> 
>> You also found "literal" used the same way as JavaScript in Ruby, one of 
>> three languages in your quick survey. It's also used similarly in ML and 
>> Haskell. In Lisp, it has a completely different meaning (a quoted list).
>> 
>> But as I said before, we can't use the word "literal" to contrast with 
>> comprehensions, because a large segment of the Python community (including 
>> you) would find that use of the word confusing and/or annoying because you 
>> intuitively think of the C/FORTRAN/etc. definition rather than the 
>> C++/Ruby/JS/Haskell definition. It doesn't matter whether that's a peculiar 
>> quirk of the Python community or not, whether there's a good reason for it 
>> or not, etc.; all that matters is that it's true.
> 
> Even though it's true, I'm not sure it's sufficient to rule out a
> switch to "there are two kinds of literal" as the preferred
> terminology.
> 
> The recent case that comes to mind is the new format string literals -
> those can include arbitrary subexpressions, like container displays
> and comprehensions, but the conclusion from the PEP 498 discussion was
> that it makes the most sense to still consider them a kind of string
> literal.
> 
> There's also a relatively straightforward way of defining the key
> semantic different between a literal and a normal constructor call:
> with a literal, there's no way to override the type of the resulting
> object, while a constructor call can be monkeypatched like any other
> callable.

Is that an important distinction to anyone but people who write Python 
implementations? If some library I'm using chooses to monkeypatch or shadow a 
type name, the objects are still going to quack the way I expect (otherwise, 
I'm going to stop using that library pretty quickly). And meanwhile, why do I 
need to distinguish between libraries that monkeypatch the stdlib for me and 
libraries that install an import hook to patch my code?

It's certainly not meaningless or completely useless (e.g., the discussion 
about whether f-strings are literals would have been shorter, and had more of a 
point), but it doesn't seem useful enough to be worth redefining existing 
terminology.

> The distinction that arises for containers is then the one that Chris
> Angelico pointed out: a container literal may have constant content,
> *or* it may have dynamic content.

Well, yes, but, again, both forms of container literal can have dynamic 
content: [f(), g()] is just as dynamic as [x() for x in (f, g)]. So we still 
don't have the contrast we were looking for.

Also, [1, 2] is literal, and not dynamic, but it's not a constant value, so 
calling it a constant literal seems likely to be more confusing than helpful.

One more thing: we don't have to worry about whether def and class are literal

Re: [Python-Dev] Python Language Reference has no mention of list comprehensions

2015-12-03 Thread Andrew Barnert via Python-Dev
On Dec 3, 2015, at 17:25, Steven D'Aprano  wrote:
> 
> On Thu, Dec 03, 2015 at 09:25:53AM -0800, Andrew Barnert via Python-Dev wrote:
>>> On Dec 3, 2015, at 08:15, MRAB  wrote:
>>> 
>>>>> On 2015-12-03 15:09, Random832 wrote:
>>>>> On 2015-12-03, Laura Creighton  wrote:
>>>>> Who came up with the word 'display' and what does it have going for
>>>>> it that I have missed?  Right now I think its chief virtue is that
>>>>> it is a meaningless noun.  (But not meaningless enough, as I
>>>>> associate displays with output, not construction).
> 
> I completely agree with Laura here -- to me "display" means output, not 
> construction, no matter what the functional programming community says 
> :-) but I suppose the connection is that you can construct a list using
> the same syntax used to display that list: [1, 2, 3] say.
> 
> I don't think the term "display" will ever feel natural to me, but I 
> have got used to it.
> 
> 
> Random832 wrote:
> 
>>>> In a recent discussion it seemed like people mainly use it
>>>> because they don't like using "literal" for things other than
>>>> single token constants.  In most other languages' contexts the
>>>> equivalent thing would be called a literal.
> 
> I'm not sure where you get "most" other languages from. At the very 
> least, I'd want to see a language survey. I did a *very* fast one (an 
> entire three languages *wink*) and found these results:
> 
> The equivalent of a list [1, a, func(), x+y] is called:
> 
> "display" (Python)
> 
> "literal" (Ruby)
> 
> "constructor" (Lua)
> 
> http://ruby-doc.org/core-2.1.1/doc/syntax/literals_rdoc.html#label-Arrays
> http://www.lua.org/manual/5.1/manual.html
> 
> Of the three, I think Lua's terminology is least worst.
> 
> 
> MRAB:
>>> "Literals" also tend to be constants, or be constructed out of
>>> constants.
> 
> Andrew: 
>> I've seen people saying that before, but I don't know where they get 
>> that. It's certainly not the way, say, C++ or JavaScript use the term.
> 
> I wouldn't take either of those two languages as examples of best 
> practices in language design :-)

No, but they seem to be the languages (along with C and Java) that people 
usually appeal to.

You also found "literal" used the same way as JavaScript in Ruby, one of three 
languages in your quick survey. It's also used similarly in ML and Haskell. In 
Lisp, it has a completely different meaning (a quoted list). 

But as I said before, we can't use the word "literal" to contrast with 
comprehensions, because a large segment of the Python community (including you) 
would find that use of the word confusing and/or annoying because you 
intuitively think of the C/FORTRAN/etc. definition rather than the 
C++/Ruby/JS/Haskell definition. It doesn't matter whether that's a peculiar 
quirk of the Python community or not, whether there's a good reason for it or 
not, etc.; all that matters is that it's true.

> [...]
>>> A list comprehension can contain functions, etc.
>> 
>> A non-comprehension display can include function calls, lambdas, or 
>> any other kind of expression, just as easily as a comprehension can. 
>> Is [1, x, f(y), lambda z: w+z] a literal? If so, why isn't [i*x for i 
>> in y] a literal?
> 
> I wouldn't call either a literal.

My point was that if the reason comprehensions aren't literals but the other 
kind of displays are is that the former can contain functions and the latter 
can't, that reason is just wrong. Both can contain functions. The intuition 
MRAB was appealing to doesn't even match his intuition, much less a universal 
one. And my sentence that you quoted directly below directly follows from that:

>> The problem is that we need a word that distinguishes the former; 
>> trying to press "literal" into service to help the distinction doesn't 
>> help.
>> 
>> At some point, Python distinguished between displays and 
>> comprehensions; I'm assuming someone realized there's no principled 
>> sense in which a comprehension isn't also a display, and now we're 
>> stuck with no word again.
> 
> I don't think comprehensions are displays.

Well, the reference docs say they are. (See 6.2.4 and following.) And I don't 
think the word "display" is used in the tutorial, glossary, etc.; the only 
place it's used, it explicitly includes comprehensions, calls them a "fl

Re: [Python-Dev] Python Language Reference has no mention of list comprehensions

2015-12-03 Thread Andrew Barnert via Python-Dev
> On Dec 3, 2015, at 08:15, MRAB  wrote:
> 
>>> On 2015-12-03 15:09, Random832 wrote:
>>> On 2015-12-03, Laura Creighton  wrote:
>>> Who came up with the word 'display' and what does it have going for
>>> it that I have missed?  Right now I think its chief virtue is that
>>> it is a meaningless noun.  (But not meaningless enough, as I
>>> associate displays with output, not construction).
>> 
>> In a recent discussion it seemed like people mainly use it
>> because they don't like using "literal" for things other than
>> single token constants.  In most other languages' contexts the
>> equivalent thing would be called a literal.
> "Literals" also tend to be constants, or be constructed out of
> constants.

I've seen people saying that before, but I don't know where they get that. It's 
certainly not the way, say, C++ or JavaScript use the term. But I don't see any 
point in arguing about it if people just accept that "literal" is too broad a 
term to capture any useful intuition here. 

> A list comprehension can contain functions, etc.

A non-comprehension display can include function calls, lambdas, or any other 
kind of expression, just as easily as a comprehension can. Is [1, x, f(y), 
lambda z: w+z] a literal? If so, why isn't [i*x for i in y] a literal?

The problem is that we need a word that distinguishes the former; trying to 
press "literal" into service to help the distinction doesn't help.

At some point, Python distinguished between displays and comprehensions; I'm 
assuming someone realized there's no principled sense in which a comprehension 
isn't also a display, and now we're stuck with no word again.

>>> I think that
>>> 
>>>6.2.4 Constructing lists, sets and dictionaries
>>> 
>>> would be a much more useful title, and
>>> 
>>>6.2.4 Constructing lists, sets and dictionaries -- explicitly or through 
>>> the use of comprehensions
>> 
>> I don't like the idea of calling it "explicit construction".
>> Explicit construction to me means the actual use of a call to the
>> constructor function.

Agreed.

The obvious mathematical terms are "extension" and "intension", but I get the 
feeling nobody would go for that.

Ultimately, the best we have is "displays that aren't comprehensions" or 
"constructions that aren't comprehensions".

Which means that something like "list, set, and dictionary displays (including 
comprehensions)" is about as good as you can make it without inventing a new 
term. There's nothing to contrast comprehensions with.



Re: [Python-Dev] Deleting with setting C API functions

2015-12-02 Thread Andrew Barnert via Python-Dev
On Dec 2, 2015, at 07:01, Random832  wrote:
> 
> On 2015-12-02, Victor Stinner  wrote:
>>> Are there plans for a Python 4?
>> 
>> No. Don't. Don't schedule any "removal" or *any* kind of "break
>> backward compatibility" anymore, or you will definetly kill the Python
>> community.
> 
> I feel like I should note that I agree with your position here, I was
> just asking the question to articulate the issue that "put it off to the
> indefinite future" isn't a real plan for anything.

Python could just go from 3.9 to 4.0, as a regular dot release, just to dispel 
the idea of an inevitable backward-incompatible "Python 4". (That should be 
around 2 years after the expiration of 2.7 support, py2/py3 naming, etc., 
right?)

Or, of course, Python could avoid the number 4, go to 3.17 and then decide that 
the next release is big enough to be worthy of 5.0.

Or go from 3.9 to 2022, or XP, or Python Enterprise Python 1.  :)
