Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Erik Bray
On Mon, Oct 8, 2018 at 12:20 PM Cameron Simpson  wrote:
>
> On 08Oct2018 10:56, Ram Rachum  wrote:
> >That's incredibly interesting. I've never used mmap before.
> >However, there's a problem.
> >I did a few experiments with mmap now, this is the latest:
> >
> >path = pathlib.Path(r'P:\huge_file')
> >
> >with path.open('r') as file:
> >mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
>
> Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?
>
> >for match in re.finditer(b'.', mmap):
> >pass
> >
> >The file is 338GB in size, and it seems that Python is trying to load it
> >into memory. The process is now taking 4GB RAM and it's growing. I saw the
> >same behavior when searching for a non-existing match.
> >
> >Should I open a Python bug for this?
>
> Probably not. First figure out what is going on. BTW, how much RAM have you
> got?
>
> As you access the mapped file the OS will try to keep it in memory in case you
> need that again. In the absence of competition, most stuff will get paged out
> to accommodate it. That's normal. All the data are "clean" (unmodified) so the
> OS can simply release the older pages instantly if something else needs the
> RAM.
>
> However, another possibility is that the regexp is consuming lots of memory.
>
> The regexp seems simple enough (b'.'), so I doubt it is leaking memory like
> mad; I'm guessing you're just seeing the OS page in as much of the file as it
> can.

Yup. Windows will aggressively fill up your RAM in cases like this
because, after all, why not?  There's no use in having memory just
sitting around unused.  For read-only, non-anonymous mappings it's not
much of a problem for the OS to drop pages that haven't been recently
accessed and use them for something else.  So I wouldn't be too
worried about the process chewing up RAM.
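For reference, here is a sketch of the snippet under discussion with the mapping bound to a name like "mapped" (per Cameron's remark) rather than shadowing the mmap module. For illustration it scans a small temporary file; the same pattern applies to the 338 GB case, since the read-only mapping is paged in lazily by the OS:

```python
import mmap
import re
import tempfile

# For illustration, write a small file; the same pattern applies to a
# huge file, since the read-only mapping is paged in lazily.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b'line one\nline two\nline three\n')
    name = f.name

with open(name, 'rb') as file:
    # Bind the mapping to "mapped" instead of shadowing the mmap module.
    with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        # Bytes patterns work directly on the mapping (buffer protocol).
        newlines = sum(1 for _ in re.finditer(rb'\n', mapped))

print(newlines)
```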

I feel like this is veering more into python-list territory for
further discussion though.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Erik Bray
On Mon, Oct 8, 2018 at 12:23 PM Nathaniel Smith  wrote:
>
> On Mon, Oct 8, 2018 at 2:55 AM, Steven D'Aprano  wrote:
> >
> > On Mon, Oct 08, 2018 at 09:10:40AM +0200, Jimmy Girardet wrote:
> >> Each tool which wants to use pyproject.toml has to add a toml lib  as a
> >> conditional or hard dependency.
> >>
> >> Since toml is now the standard configuration file format,
> >
> > It is? Did I miss the memo? Because I've never even heard of TOML before
> > this very moment.
>
> He's referring to PEPs 518 and 517 [1], which indeed standardize on
> TOML as a file format for Python package build metadata.
>
> I think moving anything into the stdlib would be premature though –
> TOML libraries are under active development, and the general trend in
> the packaging space has been to move things *out* of the stdlib (e.g.
> there's repeated rumblings about moving distutils out), because the
> stdlib release cycle doesn't work well for packaging infrastructure.

If I had the energy to argue it I would also argue against using TOML
in those PEPs.  I personally don't especially care for TOML and what's
"obvious" to Tom is not at all obvious to me.  I'd rather just stick
with YAML or perhaps something even simpler than either one.


Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2018-09-24 Thread Erik Bray
On Fri, Sep 21, 2018 at 12:58 AM Chris Angelico  wrote:
>
> On Fri, Sep 21, 2018 at 8:52 AM Kyle Lahnakoski  
> wrote:
> > Since the java.lang.Thread.stop() "debacle", it has been obvious that
> > stopping code to run other code has been dangerous.  KeyboardInterrupt
> > (any interrupt really) is dangerous. Now, we can probably code a
> > solution, but how about we remove the danger:
> >
> > I suggest we remove interrupts from Python, and make them act more like
> > java.lang.Thread.interrupt(); setting a thread local bit to indicate an
> > interrupt has occurred.  Then we can write explicit code to check for
> > that bit, and raise an exception in a safe place if we wish.  This can
> > be done with Python code, or convenient places in Python's C source
> > itself.  I imagine it would be easier to whitelist where interrupts can
> > raise exceptions, rather than blacklisting where they should not.
>
> The time machine strikes again!
>
> https://docs.python.org/3/c-api/exceptions.html#signal-handling

Although my original post did not explicitly mention
PyErr_CheckSignals() and friends, it had already taken that into
account and it is not a silver bullet, at least w.r.t. the exact issue
I raised, which had to do with the behavior of context managers versus
the

setup()
try:
do_thing()
finally:
cleanup()

pattern, and the question of how signals are handled between Python
interpreter opcodes.  There is a still-open bug on the issue tracker
discussing the exact issue in greater details:
https://bugs.python.org/issue29988
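For readers unfamiliar with the java.lang.Thread.interrupt() style Kyle describes above, here is a rough Python-level sketch (an illustration of the idea, not a proposed implementation): a SIGINT handler that merely records the interrupt in a flag, leaving the program to check it at an explicitly safe point rather than having KeyboardInterrupt raised asynchronously mid-cleanup:

```python
import os
import signal

# Sketch of the quoted idea: instead of raising KeyboardInterrupt
# asynchronously, record the interrupt in a flag and let the program
# poll for it at explicitly safe points.
pending_interrupt = False

def deferring_handler(signum, frame):
    global pending_interrupt
    pending_interrupt = True  # just set the bit; don't raise here

old_handler = signal.signal(signal.SIGINT, deferring_handler)
try:
    os.kill(os.getpid(), signal.SIGINT)  # simulate Ctrl-C mid-"critical section"
    while not pending_interrupt:
        pass  # give the interpreter a beat to run the Python-level handler
    # ... cleanup that must not be interrupted would run here ...
finally:
    signal.signal(signal.SIGINT, old_handler)

# Now, at a safe point, honor the interrupt explicitly:
if pending_interrupt:
    print('interrupt was deferred')
```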


Re: [Python-ideas] Move optional data out of pyc files

2018-04-11 Thread Erik Bray
On Tue, Apr 10, 2018 at 9:50 PM, Eric V. Smith  wrote:
>
>>> 3. Annotations. They are used mainly by third party tools that
>>> statically analyze sources. They are rarely used at runtime.
>>
>> Even less used than docstrings probably.
>
> typing.NamedTuple and dataclasses use annotations at runtime.

Astropy uses annotations at runtime for optional unit checking on
arguments that take dimensionful quantities:
http://docs.astropy.org/en/stable/api/astropy.units.quantity_input.html#astropy.units.quantity_input
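(This is not astropy's actual implementation, but a toy sketch of the general technique — a decorator that consumes a function's annotations at runtime to validate its arguments:)

```python
import functools
import inspect

def check_annotations(func):
    """Toy runtime use of annotations: validate each argument against its
    parameter's annotation (a sketch, not astropy's implementation)."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = sig.parameters[name].annotation
            if ann is not inspect.Parameter.empty and not isinstance(value, ann):
                raise TypeError('{} must be {}, got {}'.format(
                    name, ann.__name__, type(value).__name__))
        return func(*args, **kwargs)

    return wrapper

@check_annotations
def scale(value: float, factor: int):
    return value * factor

print(scale(2.5, 3))  # 7.5
```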


Re: [Python-ideas] PEP proposal: unifying function/method classes

2018-03-28 Thread Erik Bray
On Fri, Mar 23, 2018 at 11:25 AM, Antoine Pitrou  wrote:
> On Fri, 23 Mar 2018 07:25:33 +0100
> Jeroen Demeyer  wrote:
>
>> On 2018-03-23 00:36, Antoine Pitrou wrote:
>> > It does make sense, since the proposal sounds ambitious (and perhaps
>> > impossible without breaking compatibility).
>>
>> Well, *some* breakage of backwards compatibility will be unavoidable.
>>
>>
>> My plan (just a plan for now!) is to preserve backwards compatibility in
>> the following ways:
>>
>> * Existing Python attributes of functions/methods should continue to
>> exist and behave the same
>>
>> * The inspect module should give the same results as now (by changing
>> the implementation of some of the functions in inspect to match the new
>> classes)
>>
>> * Everything from the documented Python/C API.
>>
>>
>> This means that I might break compatibility in the following ways:
>>
>> * Changing the classes of functions/methods (this is the whole point of
>> this PEP). So anything involving isinstance() checks might break.
>>
>> * The undocumented parts of the Python/C API, in particular the C structure.
>
> One breaking change would be to add __get__ to C functions.  This means
> e.g. the following:
>
> class MyClass:
> my_open = open
>
> would make my_open a MyClass method, therefore you would need to spell
> it:
>
> class MyClass:
> my_open = staticmethod(open)
>
> ... if you wanted MyClass().my_open('some file') to continue to work.
>
> Of course that might be considered a minor annoyance.

I don't really see your point in this example.  For one, why would
anyone do this?  Is this based on a real example?  For another, that's
how any function works.  If you put some arbitrary function in a class
body, and it's not able to accept an instance of that class as its
first argument, then it will always be broken unless you make it a
staticmethod.  I don't see how there should be any difference there if
the function were implemented in Python or in C.

Thanks,
E


[Python-ideas] importlib: making FileFinder easier to extend

2018-02-07 Thread Erik Bray
Hello,

Brief problem statement: Let's say I have a custom file type (say,
with extension .foo) and these .foo files are included in a package
(along with other Python modules with standard extensions like .py and
.so), and I want to make these .foo files importable like any other
module.

On its face, importlib.machinery.FileFinder makes this easy.  I make a
loader for my custom file type (say, FooSourceLoader), and I can use
the FileFinder.path_hook helper like:

sys.path_hooks.insert(0, FileFinder.path_hook((FooSourceLoader, ['.foo'])))
sys.path_importer_cache.clear()

Great--now I can import my .foo modules like any other Python module.
However, any standard Python modules now cannot be imported.  The way
the PathFinder sys.meta_path hook works, sys.path_hooks entries are
first-come-first-served, and furthermore FileFinder.path_hook is very
promiscuous--it will take over module loading for *any* directory on
sys.path, regardless of what the file extensions are in that directory.
So although this mechanism is provided by the stdlib, it can't really
be used for this purpose without breaking imports of normal modules
(and maybe it's not intended for that purpose, but the documentation
is unclear).

There are a number of different ways one could get around this.  One
might be to pass FileFinder.path_hook loaders/extension pairs for all
the basic file types known by the Python interpreter.  Unfortunately
there's no great way to get that information.  *I* know that I want to
support .py, .pyc, .so etc. files, and I know which loaders to use for
them.  But that's really information that should belong to the Python
interpreter, and not something that should be reverse-engineered.  In
fact, there is such a mapping provided by
importlib.machinery._get_supported_file_loaders(), but this is not a
publicly documented function.

One could probably think of other workarounds.  For example you could
implement a custom sys.meta_path hook.  But I think it shouldn't be
necessary to go to higher levels of abstraction in order to do
this--the default sys.path handler should be able to handle this use
case.

In order to support adding support for new file types to
sys.path_hooks, I ended up implementing the following hack:

#
import os
import sys

from importlib.abc import PathEntryFinder


@PathEntryFinder.register
class MetaFileFinder:
    """
    A 'middleware', if you will, between the PathFinder sys.meta_path hook,
    and sys.path_hooks hooks--particularly FileFinder.

    The hook returned by FileFinder.path_hook is rather 'promiscuous' in that
    it will handle *any* directory.  So if one wants to insert another
    FileFinder.path_hook into sys.path_hooks, that will totally take over
    importing for any directory, and previous path hooks will be ignored.

    This class provides its own sys.path_hooks hook as follows: it should be
    inserted early on sys.path_hooks so that it can supersede anything else.
    Its find_spec method then calls each hook on sys.path_hooks after itself
    and, for each hook that can handle the given sys.path entry, it calls the
    hook to create a finder, and calls that finder's find_spec.  So each
    sys.path_hooks entry is tried until a spec is found or all finders are
    exhausted.
    """

    def __init__(self, path):
        if not os.path.isdir(path):
            raise ImportError('only directories are supported', path=path)

        self.path = path
        self._finder_cache = {}

    def __repr__(self):
        return '{}({!r})'.format(self.__class__.__name__, self.path)

    def find_spec(self, fullname, target=None):
        if not sys.path_hooks:
            return None

        for hook in sys.path_hooks:
            if hook is self.__class__:
                continue

            finder = None
            try:
                if hook in self._finder_cache:
                    finder = self._finder_cache[hook]
                    if finder is None:
                        # We've tried this finder before and got an ImportError
                        continue
            except TypeError:
                # The hook is unhashable
                pass

            if finder is None:
                try:
                    finder = hook(self.path)
                except ImportError:
                    pass

            try:
                self._finder_cache[hook] = finder
            except TypeError:
                # The hook is unhashable for some reason so we don't bother
                # caching it
                pass

            if finder is not None:
                spec = finder.find_spec(fullname, target)
                if spec is not None:
                    return spec

        # Module spec not found through any of the finders
        return None

    def invalidate_caches(self):
        for finder in self._finder_cache.values():
            if finder is not None:
                finder.invalidate_caches()
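To make the underlying FileFinder.path_hook mechanism concrete, here is a runnable sketch of the simple (and, as noted above, import-breaking) approach from the top of this message. The FooSourceLoader from the problem statement is hypothetical, so for demonstration the stdlib SourceFileLoader stands in for it, treating .foo files as plain Python source:

```python
import importlib
import os
import sys
import tempfile
from importlib.machinery import FileFinder, SourceFileLoader

# For demonstration, reuse the stdlib SourceFileLoader in place of the
# hypothetical FooSourceLoader, so .foo files are loaded as Python source.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, 'hello.foo'), 'w') as f:
    f.write('x = 42\n')

# Caveat from the post: installed this way, the hook takes over *every*
# directory on sys.path, so only .foo files remain importable from them.
hook = FileFinder.path_hook((SourceFileLoader, ['.foo']))
sys.path_hooks.insert(0, hook)
sys.path_importer_cache.clear()
sys.path.insert(0, tmpdir)

hello = importlib.import_module('hello')
print(hello.x)  # 42

# Undo the damage so normal imports work again.
sys.path_hooks.remove(hook)
sys.path.remove(tmpdir)
sys.path_importer_cache.clear()
```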


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-29 Thread Erik Bray
On Thu, Dec 28, 2017 at 8:42 PM, Serhiy Storchaka <storch...@gmail.com> wrote:
> 28.12.17 12:10, Erik Bray wrote:
>>
>> There's no index() alternative to int().
>
>
> operator.index()

Okay, and it's broken.  That doesn't change my other point that some
functions that could previously take non-int arguments can no
longer--if we agree on that at least then I can set about making a bug
report and fixing it.


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-28 Thread Erik Bray
On Fri, Dec 8, 2017 at 7:20 PM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 12/08/2017 04:33 AM, Erik Bray wrote:
>
>> More importantly not as many objects that coerce to int actually
>> implement __index__.  They probably *should* but there seems to be
>> some confusion about how that's to be used.
>
>
> __int__ is for coercion (float, fraction, etc)
>
> __index__ is for true integers
>
> Note that if __index__ is defined, __int__ should also be defined, and
> return the same value.
>
> https://docs.python.org/3/reference/datamodel.html#object.__index__

This doesn't appear to be enforced, though I think maybe it should be.

I'll also note that because of the changes I pointed out in my
original post, it's now necessary for me to explicitly cast as int()
objects that previously "just worked" when passed as arguments in some
functions in itertools, collections, and other modules with C
implementations.  However, this is bad because if some broken code is
passing floats to these arguments, they will be quietly cast to int
and succeed, when really I should only be accepting objects that have
__index__.  There's no index() alternative to int().

I think changing all these functions to do the appropriate
PyIndex_Check is a correct and valid fix, but I think it also
stretches beyond the original purpose of __index__.  I think that
__index__ is relatively unknown, and perhaps there should be better
documentation as to when and how it should be used over the
better-known __int__.
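The distinction can be seen from pure Python (a small illustration; the Seconds class is a made-up example):

```python
import operator

class Seconds:
    """Toy integer-like type that is a *true* integer, so it defines
    __index__ (and, per the datamodel docs, __int__ returning the same)."""
    def __init__(self, n):
        self.n = n
    def __index__(self):
        return self.n
    def __int__(self):
        return self.n

print(operator.index(Seconds(3)))  # 3
print(int(4.2))                    # 4 -- int() happily truncates a float

try:
    operator.index(4.2)            # ...but index() refuses non-integers
except TypeError:
    print('TypeError')
```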


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-08 Thread Erik Bray
On Fri, Dec 8, 2017 at 1:52 PM, Antoine Pitrou  wrote:
> On Fri, 8 Dec 2017 14:30:00 +0200
> Serhiy Storchaka 
> wrote:
>>
>> NumPy integers implement __index__.
>
> That doesn't help if a function calls e.g. PyLong_AsLongAndOverflow().

Right--pointing to __index__ basically implies that there should be
more PyIndex_Check and subsequent PyNumber_AsSsize_t calls than there
currently are.  That I could agree with, but then it becomes a
question of: where are those cases?  And what to do with, e.g.,
interfaces like PyLong_AsLongAndOverflow()?
Add more PyNumber_ conversion functions?


Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-08 Thread Erik Bray
On Fri, Dec 8, 2017 at 12:26 PM, Serhiy Storchaka <storch...@gmail.com> wrote:
> 08.12.17 12:41, Erik Bray wrote:
>>
>> IIUC, it seems to be carry-over from Python 2's PyLong API, but I
>> don't see an obvious reason for it.  In every case there's an explicit
>> PyLong_Check first anyways, so not calling __int__ doesn't help for
>> the common case of exact int objects; adding the fallback costs
>> nothing in that case.
>
>
> There is also a case of int subclasses. It is expected that PyLong_AsLong is
> atomic, and calling __int__ can lead to crashes or similar consequences.
>
>> I ran into this because I was passing an object that implements
>> __int__ to the maxlen argument to deque().  On Python 2 this used
>> PyInt_AsSsize_t which does fall back to calling __int__, whereas
>> PyLong_AsSsize_t does not.
>
>
> PyLong_* functions provide an interface to PyLong objects. If they don't
> return the content of a PyLong object, how can it be retrieved? If you want
> to work with general numbers you should use PyNumber_* functions.

By "you " I assume you meant the generic "you".  I'm not the one who
broke things in this case :)

> In your particular case it is more reasonable to fallback to __index__
> rather than __int__. Unlikely maxlen=4.2 makes sense.

That's true, but in Python 2 that was possible:

>>> deque([], maxlen=4.2)
deque([], maxlen=4)

More importantly not as many objects that coerce to int actually
implement __index__.  They probably *should* but there seems to be
some confusion about how that's to be used.  It was mainly motivated
by slices, but it *could* be used in general cases where it definitely
wouldn't make sense to accept a float.  (I wonder if maybe the real
problem here is that floats can be coerced automatically to ints.)

In other words, there are probably countless other cases in the stdlib
at all where it "doesn't make sense" to accept a float, but that
otherwise should accept objects that can be coerced to int without
having to manually wrap those objects with an int(o) call.
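A small illustration of that duck-typed-integer case (the NativeInt class is a made-up example): parts of the language that already go through __index__ accept such objects with no int() wrapping, while rejecting floats outright:

```python
class NativeInt:
    """Toy duck-typed integer: __index__ lets it be used anywhere a true
    integer is expected, with no explicit int() wrapping by the caller."""
    def __init__(self, n):
        self.n = n
    def __index__(self):
        return self.n

n = NativeInt(3)
print([0] * n)         # sequence repetition accepts __index__
print(list(range(n)))  # so does range()

try:
    [0] * 2.5          # ...whereas a float is rejected outright
except TypeError:
    print('TypeError')
```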

>> Currently the following functions fall back on __int__ where available:
>>
>> PyLong_AsLong
>> PyLong_AsLongAndOverflow
>> PyLong_AsLongLong
>> PyLong_AsLongLongAndOverflow
>> PyLong_AsUnsignedLongMask
>> PyLong_AsUnsignedLongLongMask
>
>
> I think this should be deprecated (and there should be an open issue for
> this). Calling __int__ is just a Python 2 legacy.

Okay, but then there are probably many cases where they should be
replaced with PyNumber_ equivalents or else who knows how much code
would break.


[Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?

2017-12-08 Thread Erik Bray
IIUC, it seems to be carry-over from Python 2's PyLong API, but I
don't see an obvious reason for it.  In every case there's an explicit
PyLong_Check first anyways, so not calling __int__ doesn't help for
the common case of exact int objects; adding the fallback costs
nothing in that case.

I ran into this because I was passing an object that implements
__int__ to the maxlen argument to deque().  On Python 2 this used
PyInt_AsSsize_t which does fall back to calling __int__, whereas
PyLong_AsSsize_t does not.

Currently the following functions fall back on __int__ where available:

PyLong_AsLong
PyLong_AsLongAndOverflow
PyLong_AsLongLong
PyLong_AsLongLongAndOverflow
PyLong_AsUnsignedLongMask
PyLong_AsUnsignedLongLongMask

whereas the following (at least according to the docs--haven't checked
the code in all cases) do not:

PyLong_AsSsize_t
PyLong_AsUnsignedLong
PyLong_AsSize_t
PyLong_AsUnsignedLongLong
PyLong_AsDouble
PyLong_AsVoidPtr

I think this inconsistency should be fixed, unless there's some reason
for it I'm not seeing.

Thanks,
Erik


Re: [Python-ideas] install pip packages from Python prompt

2017-11-04 Thread Erik Bray
On Nov 4, 2017 08:31, "Stephen J. Turnbull" <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

Erik Bray writes:

 > Nope.  I totally get that they don’t know what a shell or command prompt
 > is.  THEY. NEED. TO. LEARN.


Just to be clear I did not write this. Someone replying to me did.

I'm going to go over all the different proposals in this thread and see if
I can synthesize a list of options. I think, even if it's not a solution
that winds up in the stdlib, it would be good to have some user stories
about how package installation from within an interactive prompt might work
(even if not from the standard REPL, which it should be noted has had small
improvements made to it over the years).

I also have my doubts about whether this *shouldn't* be possible. I mean,
to a lot of beginners starting out the basic REPL *is* Python. They're so
new to the scene they don't even know what IPython or Jupyter is or why
they might want that. They aren't experienced enough to even know what
they're missing out on. In classrooms we can resolve that easily by
pointing our students to whatever tools we think will work best for them,
but not everyone has that privilege.

Best,
Erik

I don't want to take a position on the proposal, and I agree that we
should *strongly* encourage everyone to learn.  But "THEY. NEED. TO.
LEARN." is not obvious to me.

Anecdotally, my students are doing remarkably (to me, as a teacher)
complex modeling with graphical interfaces to statistical and
simulation packages (SPSS/AMOS, Artisoc, respectively), and collection
of large textual databases from SNS with cargo-culted Python programs.
For the past twenty years teaching social scientists, these accidental
barriers (as Fred Brooks would have called them) have dropped
dramatically, to the point where it's possible to do superficially
good-looking (= complex) but entirely meaningless :-/ empirical
research.  (In some ways I think this lowered cost has been horribly
detrimental to my work as an educator in applied social science. ;-)

The point being that "user-friendly" UI in many fields where (fairly)
advanced computing is used is more than keeping up with the perceived
needs of most computer users, while the essential (in the sense of
Brooks) non-computing modeling difficulties of their jobs remain.

By "perceived" I mean I want my students using TeX, but it's hard to
force them when all their professors (except me and a couple
mathematicians) use Word (speaking of irreproducible results).  It's
good enough for government work, and that's in fact where many of them
end up (and the great majority are either in government or in
equivalent corporate bureaucrat positions).  Yes, I meant the
deprecatory connotations of "perceived", but realistically, I admit
that maybe they *don't* *need* the more polished tech that I could
teach them.


I remember when I first started out teaching Software Carpentry I made the
embarrassing mistake (coming from Physics) of assuming that LaTex is
de-facto in most other academic fields :)

 > Hiding it is not a good idea for anyone.

Agreed.  Command lines and REPLs teach humility, to me as well as my
students. :-)

Steve


--
Associate Professor  Division of Policy and Planning Science
http://turnbull/sk.tsukuba.ac.jp/ Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp   University of Tsukuba
Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN


Re: [Python-ideas] install pip packages from Python prompt

2017-11-02 Thread Erik Bray
On Oct 30, 2017 8:57 PM, "Alex Walters" <tritium-l...@sdamon.com> wrote:



> -Original Message-
> From: Python-ideas [mailto:python-ideas-bounces+tritium-
> list=sdamon@python.org] On Behalf Of Erik Bray
> Sent: Monday, October 30, 2017 6:28 AM
> To: Python-Ideas <python-ideas@python.org>
> Subject: Re: [Python-ideas] install pip packages from Python prompt
>
> On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters <tritium-l...@sdamon.com>
> wrote:
> > Then those users have more fundamental problems.  There is a minimum
> level
> > of computer knowledge needed to be successful in programming.
> Insulating
> > users from the reality of the situation is not preparing them to be
> > successful.  Pretending that there is no system command prompt, or
shell,
> or
> > whatever platform specific term applies, only hurts new programmers.
> Give
> > users an error message they can google, and they will be better off in
the
> > long run than they would be if we just ran pip for them.
>
> While I completely agree with this in principle, I think you
> overestimate the average beginner.

Nope.  I totally get that they don’t know what a shell or command prompt
is.  THEY. NEED. TO. LEARN.  Hiding it is not a good idea for anyone.  If
this is an insurmountable problem for the newbie, maybe they really
shouldn’t be attempting to program.  This field is not for everyone.


Reading this I get the impression, and correct me if I'm wrong, that you've
never taught beginners programming. Of course long term (heck in fact
fairly early on) they need to learn these nitty-gritty and sometimes
frustrating lessons, but not in a 2 hour intro to programming for total
beginners.

And I beg to differ--this field is for everyone, and increasingly more
so every day. Doesn't mean it's easy, but it is and can be for everyone.

Whether this specific proposal is technically feasible in a cross-platform
manner with the state of the Python interpreter and import system is
another question. But that's a discussion worth having. "Some people aren't
cut out for programming" isn't.


>  Many beginners I've taught or
> helped, even if they can manage to get to the correct command prompt,
> often don't even know how to run the correct Python.  They might often
> have multiple Pythons installed on their system--maybe they have
> Anaconda, maybe Python installed by homebrew, or a Python that came
> with an IDE like Spyder.  If they're on OSX often running "python"
> from the command prompt gives the system's crippled Python 2.6 and
> they don't know the difference.
>
> One thing that has been a step in the right direction is moving more
> documentation toward preferring running `python -m pip` over just
> `pip`, since this often has a better guarantee of running `pip` in the
> Python interpreter you intended.  But that still requires one to know
> how to run the correct Python interpreter from the command-line (which
> the newbie double-clicking on IDLE may not even have a concept of...).
>
> While I agree this is something that is important for beginners to
> learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for
> many newbies just to install one or two packages from pip, which they
> often might need/want to do for whatever educational pursuit they're
> following (heck, it's pretty common even just to want to install the
> `requests` module, as I would never throw `urllib` at a beginner).
>
> So while I don't think anything proposed here will work technically, I
> am in favor of an in-interpreter pip install functionality.  Perhaps
> it could work something like this:
>
> a) Allow it *only* in interactive mode:  running `pip(...)` (or
> whatever this looks like) outside of interactive mode raises a
> `RuntimeError` with the appropriate documentation
> b) When running `pip(...)` the user is supplied with an interactive
> prompt explaining that since installing packages with `pip()` can
> result in changes to the interpreter, it is necessary to restart the
> interpreter after installation--give them an opportunity to cancel the
> action in case they have any work they need to save.  If they proceed,
> install the new package then restart the interpreter for them.  This
> avoids any ambiguity as to states of loaded modules before/after pip
> install.
> > From: Stephan Houben [mailto:stephan...@gmail.com]
> > Sent: Sunday, October 29, 2017 3:43 PM
> > To: Alex Walters <tritium-l...@sdamon.com>
> > Cc: Python-Ideas <python-ideas@python.org>
> > Subject: Re: [Python-ideas] install pip packages from Python prompt
> >
> >
> >
> > Hi Alex,
> >
> >
> >
> > 2017-10-29 20:26 GMT+01:00 Alex Walters <tri

Re: [Python-ideas] install pip packages from Python prompt

2017-10-30 Thread Erik Bray
On Mon, Oct 30, 2017 at 11:27 AM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters <tritium-l...@sdamon.com> wrote:
>> Then those users have more fundamental problems.  There is a minimum level
>> of computer knowledge needed to be successful in programming.  Insulating
>> users from the reality of the situation is not preparing them to be
>> successful.  Pretending that there is no system command prompt, or shell, or
>> whatever platform specific term applies, only hurts new programmers.  Give
>> users an error message they can google, and they will be better off in the
>> long run than they would be if we just ran pip for them.
>
> While I completely agree with this in principle, I think you
> overestimate the average beginner.  Many beginners I've taught or
> helped, even if they can manage to get to the correct command prompt,
> often don't even know how to run the correct Python.  They might often
> have multiple Pythons installed on their system--maybe they have
> Anaconda, maybe Python installed by homebrew, or a Python that came
> with an IDE like Spyder.  If they're on OSX often running "python"
> from the command prompt gives the system's crippled Python 2.6 and
> they don't know the difference.


I should add--another case that is becoming extremely common is
beginners learning Python for the first time inside the
Jupyter/IPython Notebook.  And in my experience it can be very
difficult for beginners to understand the connection between what's
happening in the notebook ("it's in the web-browser--what does that
have to do with anything on my computer??") and the underlying Python
interpreter, file system, etc.  Being able to pip install from within
the Notebook would be a big win.  This is already possible since
IPython allows running system commands and it is possible to run the
pip executable from the notebook, then manually restart the Jupyter
kernel.

It's not 100% clear to me how my proposal below would work within a
Jupyter Notebook, so that would also be an angle worth looking into.
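For reference, the core of such a helper can be sketched in a few lines. The pip_install name and its dry_run parameter are hypothetical, purely for illustration; the one grounded piece is running pip via `sys.executable -m pip`, which sidesteps the which-pip-am-I-running problem described earlier:

```python
import subprocess
import sys

def pip_install(*packages, dry_run=False):
    """Hypothetical in-interpreter pip helper (not an actual pip or stdlib
    API): run pip in *this* interpreter via 'python -m pip'."""
    cmd = [sys.executable, '-m', 'pip', 'install', *packages]
    if dry_run:
        return cmd  # just show what would run
    subprocess.check_call(cmd)
    print('Restart the interpreter (or Jupyter kernel) to pick up new packages.')

print(pip_install('requests', dry_run=True))
```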

Best,
Erik


> One thing that has been a step in the right direction is moving more
> documentation toward preferring running `python -m pip` over just
> `pip`, since this often has a better guarantee of running `pip` in the
> Python interpreter you intended.  But that still requires one to know
> how to run the correct Python interpreter from the command-line (which
> the newbie double-clicking on IDLE may not even have a concept of...).
>
> While I agree this is something that is important for beginners to
> learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for
> many newbies just to install one or two packages from pip, which they
> often might need/want to do for whatever educational pursuit they're
> following (heck, it's pretty common even just to want to install the
> `requests` module, as I would never throw `urllib` at a beginner).
>
> So while I don't think anything proposed here will work technically, I
> am in favor of an in-interpreter pip install functionality.  Perhaps
> it could work something like this:
>
> a) Allow it *only* in interactive mode:  running `pip(...)` (or
> whatever this looks like) outside of interactive mode raises a
> `RuntimeError` with the appropriate documentation
> b) When running `pip(...)` the user is supplied with an interactive
> prompt explaining that since installing packages with `pip()` can
> result in changes to the interpreter, it is necessary to restart the
> interpreter after installation--give them an opportunity to cancel the
> action in case they have any work they need to save.  If they proceed,
> install the new package then restart the interpreter for them.  This
> avoids any ambiguity as to states of loaded modules before/after pip
> install.
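A minimal sketch of what (a) could look like — `pip()` here is a hypothetical helper, and the confirm-and-restart logic of (b) is only indicated in comments:

```python
import sys

def pip(*args):
    """Hypothetical in-interpreter installer, per the proposal above."""
    # (a) Only allow use from the interactive prompt; sys.ps1 is only
    # defined while the REPL is running.
    if not hasattr(sys, "ps1"):
        raise RuntimeError(
            "pip() is only available in interactive mode; "
            "use 'python -m pip install ...' from the command line"
        )
    # (b) Here one would confirm with the user, run
    # [sys.executable, "-m", "pip", "install", *args] in a subprocess,
    # and then restart the interpreter (e.g. os.execv(sys.executable, ...))
    # so that no stale module state survives the install.
```

Run from a script (non-interactively), the call fails loudly as proposed in (a).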
>
>
>
>> From: Stephan Houben [mailto:stephan...@gmail.com]
>> Sent: Sunday, October 29, 2017 3:43 PM
>> To: Alex Walters <tritium-l...@sdamon.com>
>> Cc: Python-Ideas <python-ideas@python.org>
>> Subject: Re: [Python-ideas] install pip packages from Python prompt
>>
>>
>>
>> Hi Alex,
>>
>>
>>
>> 2017-10-29 20:26 GMT+01:00 Alex Walters <tritium-l...@sdamon.com>:
>>
>> return “Please run pip from your system command prompt”
>>
>>
>>
>>
>>
>> The target audience for my proposal are people who do not know
>>
>> which part of the sheep the "system command prompt" is.
>>
>> Stephan
>>
>>
>>
>>
>>
>> From: Python-ideas
>> [mailto:python-ideas-bounces+tritium-list=sdamon@python.org] On Behalf
>> Of Stephan Houben
>> Sent: Sunda

Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
On Wed, Jun 28, 2017 at 3:19 PM, Greg Ewing <greg.ew...@canterbury.ac.nz> wrote:
> Erik Bray wrote:
>>
>> At this point a potentially
>> waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
>> while inside the with statement's suite, and finally block, and hence
>> Lock.__exit__ are entered.
>
>
> Seems to me this is the behaviour you *want* in this case,
> otherwise the lock can be acquired and never released.
> It's disconcerting that it seems to be very difficult to
> get that behaviour with a pure Python implementation.

I think normally you're right--this is the behavior you would *want*,
but not the behavior that's consistent with how Python implements the
`with` statement, all else being equal.  Though the comparison still
isn't entirely fair, because if Lock.__enter__ were pure Python
somehow, it's possible the exception would be raised either before or
after the lock is actually marked as "acquired", whereas in the C
implementation acquisition of the lock will always succeed (assuming
the lock was free, and no other exceptional conditions) before the
signal handler is executed.

>> I think it might be possible to
>> gain more consistency between these cases if pending signals are
>> checked/handled after any direct call to PyCFunction from within the
>> ceval loop.
>
>
> IMO that would be going in the wrong direction by making
> the C case just as broken as the Python case.
>
> Instead, I would ask what needs to be done to make this
> work correctly in the Python case as well as the C case.

You have a point there, but at the same time the Python case, while
"broken" insofar as it can lead to broken code, seems correct from the
Pythonic perspective.  The other possibility would be to actually
change the semantics of the `with` statement. Or as you mention below,
a way to temporarily mask signals...

> I don't think it's even possible to write Python code that
> does this correctly at the moment. What's needed is a
> way to temporarily mask delivery of asynchronous exceptions
> for a region of code, but unless I've missed something,
> no such facility is currently provided.
>
> What would such a facility look like? One possibility
> would be to model it on the sigsetmask() system call, so
> there would be a function such as
>
>mask_async_signals(bool)
>
> that turns delivery of async signals on or off.
>
> However, I don't think that would work. To fix the locking
> case, what we need to do is mask async signals during the
> locking operation, and only unmask them once the lock has
> been acquired. We might write a context manager with an
> __enter__ method like this:
>
>def __enter__(self):
>   mask_async_signals(True)
>   try:
>  self.acquire()
>   finally:
>  mask_async_signals(False)
>
> But then we have the same problem again -- if a Keyboard
> Interrupt occurs after mask_async_signals(False) but
> before __enter__ returns, the lock won't get released.

Exactly.

> Another approach would be to provide a context manager
> such as
>
>async_signals_masked(bool)
>
> Then the whole locking operation could be written as
>
>with async_signals_masked(True):
>   lock.acquire()
>   try:
>  with async_signals_masked(False):
> # do stuff here
>   finally:
>  lock.release()
>
> Now there's no possibility for a KeyboardInterrupt to
> be delivered until we're safely inside the body, but we've
> lost the ability to capture the pattern in the form of
> a context manager.
>
> The only way out of this I can think of at the moment is
> to make the above pattern part of the context manager
> protocol itself. In other words, async exceptions are
> always masked while the __enter__ and __exit__ methods
> are executing, and unmasked while the body is executing.

I think so too.  That's more or less in line with Nick's idea on njs's
issue (https://bugs.python.org/issue29988) of an ATOMIC_UNTIL opcode.
That's just one implementation possibility.  My question would be
whether to make that a language-level requirement of the context
manager protocol, or just something CPython does...
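(For what it's worth, on POSIX something close to Greg's `async_signals_masked(True)` can be approximated today with `signal.pthread_sigmask`, which blocks delivery at the OS level before CPython's handler ever runs. A sketch — POSIX-only, main-thread-only, and it still leaves the window at the unmasking call that we're discussing:)

```python
import signal
from contextlib import contextmanager

@contextmanager
def async_signals_masked():
    # Block OS delivery of SIGINT; any SIGINT sent meanwhile stays pending.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
    try:
        yield
    finally:
        # Restoring the mask releases any pending SIGINT, so the
        # KeyboardInterrupt fires here rather than inside the block.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```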

Thanks,
Erik
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
On Wed, Jun 28, 2017 at 3:09 PM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
>> On 28 June 2017 at 21:40, Erik Bray <erik.m.b...@gmail.com> wrote:
>>> My colleague's contention is that given
>>>
>>> lock = threading.Lock()
>>>
>>> this is simply *wrong*:
>>>
>>> lock.acquire()
>>> try:
>>> do_something()
>>> finally:
>>> lock.release()
>>>
>>> whereas this is okay:
>>>
>>> with lock:
>>> do_something()
>>
>> Technically both are slightly racy with respect to async signals (e.g.
>> KeyboardInterrupt), but the with statement form is less exposed to the
>> problem (since it does more of its work in single opcodes).
>>
>> Nathaniel Smith posted a good write-up of the technical details to the
>> issue tracker based on his work with trio:
>> https://bugs.python.org/issue29988
>
> Interesting; thanks for pointing this out.  Part of me felt like this
> has to have come up before but my searching didn't bring this up
> somehow (and even then it's only a couple months old itself).
>
> I didn't think about the possible race condition before
> WITH_CLEANUP_START, but obviously that's a possibility as well.
> Anyways since this is already acknowledged as a real bug I guess any
> further followup can happen on the issue tracker.

On second thought, maybe there is a case to be made w.r.t. making a
documentation change about the semantics of the `with` statement:

The old-style syntax cannot make any guarantees about atomicity w.r.t.
async events.  That is, there's no way syntactically in Python to
declare that no exception will be raised between "lock.acquire()" and
the setup of the "try/finally" blocks.

However, if issue-29988 were *fixed* somehow (and I'm not convinced it
can't be fixed in the limited case of `with` statements) then there
really would be a major semantic difference of the `with` statement in
that it does support this invariant.  Then the question is whether
that difference should be made a requirement of the language (probably
too onerous a requirement?), or just a feature of CPython (which
should still be documented one way or the other IMO).

Erik


Re: [Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 28 June 2017 at 21:40, Erik Bray <erik.m.b...@gmail.com> wrote:
>> My colleague's contention is that given
>>
>> lock = threading.Lock()
>>
>> this is simply *wrong*:
>>
>> lock.acquire()
>> try:
>> do_something()
>> finally:
>> lock.release()
>>
>> whereas this is okay:
>>
>> with lock:
>> do_something()
>
> Technically both are slightly racy with respect to async signals (e.g.
> KeyboardInterrupt), but the with statement form is less exposed to the
> problem (since it does more of its work in single opcodes).
>
> Nathaniel Smith posted a good write-up of the technical details to the
> issue tracker based on his work with trio:
> https://bugs.python.org/issue29988

Interesting; thanks for pointing this out.  Part of me felt like this
has to have come up before but my searching didn't bring this up
somehow (and even then it's only a couple months old itself).

I didn't think about the possible race condition before
WITH_CLEANUP_START, but obviously that's a possibility as well.
Anyways since this is already acknowledged as a real bug I guess any
further followup can happen on the issue tracker.

Thanks,
Erik


[Python-ideas] Asynchronous exception handling around with/try statement borders

2017-06-28 Thread Erik Bray
Hi folks,

I normally wouldn't bring something like this up here, except I think
that there is possibility of something to be done--a language
documentation clarification if nothing else, though possibly an actual
code change as well.

I've been having an argument with a colleague over the last couple
days over the proper ordering of statements when setting up a
try/finally to perform cleanup of some action.  On some level we're
both being stubborn I think, and I'm not looking for resolution as to
who's right/wrong or I wouldn't bring it to this list in the first
place.  The original argument was over setting and later restoring
os.environ, but we ended up arguing over
threading.Lock.acquire/release which I think is a more interesting
example of the problem, and he did raise a good point that I do want
to bring up.



My colleague's contention is that given

lock = threading.Lock()

this is simply *wrong*:

lock.acquire()
try:
do_something()
finally:
lock.release()

whereas this is okay:

with lock:
do_something()


Ignoring other details of how threading.Lock is actually implemented,
assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls
release() then as far as I've known ever since Python 2.5 first came
out these two examples are semantically *equivalent*, and I can't find
any way of reading PEP 343 or the Python language reference that would
suggest otherwise.
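Concretely, PEP 343 specifies the with statement as roughly the following expansion (paraphrased and simplified from the PEP; exception-chaining details omitted), which is why the two spellings look equivalent on paper:

```python
import sys
import threading

lock = threading.Lock()

def do_something():
    assert lock.locked()  # the suite runs while the lock is held

# Rough PEP 343 expansion of:  with lock: do_something()
mgr = lock
exit_ = type(mgr).__exit__
type(mgr).__enter__(mgr)        # Lock.__enter__ calls acquire()
try:
    do_something()
except BaseException:
    if not exit_(mgr, *sys.exc_info()):
        raise
else:
    exit_(mgr, None, None, None)  # Lock.__exit__ calls release()

assert not lock.locked()        # released on both paths
```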

However, there *is* a difference, and has to do with how signals are
handled, particularly w.r.t. context managers implemented in C (hence
we are talking CPython specifically):

If Lock.__enter__ is a pure Python method (even if it maybe calls some
C methods), and a SIGINT is handled during execution of that method,
then in almost all cases a KeyboardInterrupt exception will be raised
from within Lock.__enter__--this means the suite under the with:
statement is never evaluated, and Lock.__exit__ is never called.  You
can be fairly sure the KeyboardInterrupt will be raised from somewhere
within a pure Python Lock.__enter__ because there will usually be at
least one remaining opcode to be evaluated, such as RETURN_VALUE.
Because of how delayed execution of signal handlers is implemented in
the pyeval main loop, this means the signal handler for SIGINT will be
called *before* RETURN_VALUE, resulting in the KeyboardInterrupt
exception being raised.  Standard stuff.

However, if Lock.__enter__ is a PyCFunction things are quite
different.  If you look at how the SETUP_WITH opcode is implemented,
it first calls the __enter__ method with _PyObject_CallNoArg.  If this
returns NULL (i.e. an exception occurred in __enter__) then "goto
error" is executed and the exception is raised.  However if it returns
non-NULL the finally block is set up with PyFrame_BlockSetup and
execution proceeds to the next opcode.  At this point a potentially
waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
while inside the with statement's suite, and finally block, and hence
Lock.__exit__ are entered.

Long story short, because Lock.__enter__ is a C function, assuming
that it succeeds normally then

with lock:
do_something()

always guarantees that Lock.__exit__ will be called if a SIGINT was
handled inside Lock.__enter__, whereas with

lock.acquire()
try:
...
finally:
lock.release()

there is at least a small possibility that the SIGINT handler is called
after the CALL_FUNCTION op but before the try/finally block is entered
(e.g. before executing POP_TOP or SETUP_FINALLY).  So the end result
is that the lock is held and never released after the
KeyboardInterrupt (whether or not it's handled somehow).
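The window is visible directly in the bytecode (the exact opcode names vary across CPython versions — e.g. SETUP_WITH/SETUP_FINALLY in older releases, BEFORE_WITH in newer ones — but the shape is the same):

```python
import dis

def manual(lock):
    lock.acquire()   # a CALL plus a POP_TOP execute...
    try:             # ...before the finally block is set up
        pass
    finally:
        lock.release()

def managed(lock):
    with lock:       # __enter__ and the finally setup happen in one construct
        pass

# Compare the opcodes between acquire() and the finally setup in
# `manual` with the single with-statement construct in `managed`.
dis.dis(manual)
dis.dis(managed)
```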

Whereas, again, if Lock.__enter__ is a pure Python function there's
less likely to be any difference (though I don't think the possibility
can be ruled out entirely).

At the very least I think this quirk of CPython should be mentioned
somewhere (since in all other cases the semantic meaning of the
"with:" statement is clear).  However, I think it might be possible to
gain more consistency between these cases if pending signals are
checked/handled after any direct call to PyCFunction from within the
ceval loop.

Sorry for the tl;dr; any thoughts?


Re: [Python-ideas] π = math.pi

2017-06-02 Thread Erik Bray
On Fri, Jun 2, 2017 at 7:52 AM, Greg Ewing  wrote:
> Victor Stinner wrote:
>>
>> How do you write π (pi) with a keyboard on Windows, Linux or macOS?
>
>
> On a Mac, π is Option-p and ∑ is Option-w.

I don't have a strong opinion about it being in the stdlib, but I'd
also point out that a strong advantage to having these defined in a
module at all is that third-party interpreters (e.g. IPython, bpython,
some IDEs) that support tab-completion make these easy to type as
well, and I find them to be very readable for math-heavy code.


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-30 Thread Erik Bray
On Fri, Dec 30, 2016 at 5:05 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 29 December 2016 at 22:12, Erik Bray <erik.m.b...@gmail.com> wrote:
>>
>> 1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the
>> implementation--that the keys are integers starting from zero)
>> 2) pthreads: Does not define an uninitialized default value for
>> keys, for reasons described at [1] under "Non-Idempotent Data Key
>> Creation".  I understand their reasoning, though I can't claim to know
>> specifically what they mean when they say that some implementations
>> would require the mutual-exclusion to be performed on
>> pthread_getspecific() as well.  I don't know that it applies here.
>
>
> That section is a little weird, as they describe two requests (one for a
> known-NULL default value, the other for implicit synchronisation of key
> creation to prevent race conditions), and only provide the justification for
> rejecting one of them (the second one).

Right, that is confusing to me as well. I'm guessing the reason for
rejecting the first is in part a way to force us to recognize the
second issue.

> If I've understood correctly, the situation they're worried about there is
> that pthread_key_create() has to be called at least once-per-process, but
> must be called before *any* call to pthread_getspecific or
> pthread_setspecific for a given key. If you do "implicit init" rather than
> requiring the use of an explicit mechanism like pthread_once (or our own
> Py_Initialize and module import locks), then you may take a small
> performance hit as either *every* thread then has to call
> pthread_key_create() to ensure the key exists before using it, or else
> pthread_getspecific() and pthread_setspecific() have to become potentially
> blocking calls. Neither of those is desirable, so it makes sense to leave
> that part of the problem to the API client.
>
> In our case, we don't want the implicit synchronisation, we just want the
> known-NULL default value so the "Is it already set?" check can be moved
> inside the library function.

Okay, we're on the same page here then.  I just wanted to make sure
there wasn't anything else I was missing in Python's case.

>> 3) windows: The return value of TlsAlloc() is a DWORD (unsigned int)
>> and [2] states that its value should be opaque.
>>
>> So in principle we can cover all cases with an opaque struct that
>> contains, as its first member, an is_initialized flag.  The tricky
>> part is how to initialize the rest of the struct (containing the
>> underlying implementation-specific key).  For 1) and 3) it doesn't
>> matter--it can just be zero.  For 2) it's trickier because there's no
>> defined constant value to initialize a pthread_key_t to.
>>
>> Per Nick's suggestion this can be worked around by relying on C99's
>> initialization semantics. Per [3] section 6.7.8, clause 21:
>>
>> """
>> If there are fewer initializers in a brace-enclosed list than there
>> are elements or members of an aggregate, or fewer characters in a
>> string literal used to initialize an array of known size than there
>> are elements in the array, the remainder of the aggregate shall be
>> initialized implicitly the same as objects that have static storage
>> duration.
>> """
>>
>> How objects with static storage are initialized is described in the
>> previous page under clause 10, but in practice it boils down to what
>> you would expect: Everything is initialized to zero, including nested
>> structs and arrays.
>>
>> So as long as we can use this feature of C99 then I think that's the
>> best approach.
>
>
>
> I checked PEP 7 to see exactly which features we've added to the approved C
> dialect, and designated initialisers are already on the list:
> https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
>
> So I believe that would allow the initializer to be declared as something
> like:
>
> #define Py_tss_NEEDS_INIT {.is_initialized = false}

Great!  One could argue about whether or not the designated
initializer syntax also incorporates omitted fields, but it would seem
strange to insist that it doesn't.

Have a happy new year,

Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-29 Thread Erik Bray
On Wed, Dec 21, 2016 at 5:07 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 21 December 2016 at 20:01, Erik Bray <erik.m.b...@gmail.com> wrote:
>>
>> On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
>> > Option 2: Similar to option 1, but using a custom type alias, rather
>> > than
>> > using a C99 bool directly
>> >
>> > The closest API we have to these semantics at the moment would be
>> > PyGILState_Ensure, so the following API naming might work for option 2:
>> >
>> > Py_ensure_t
>> > Py_ENSURE_NEEDS_INIT
>> > Py_ENSURE_INITIALIZED
>> >
>> > Respectively, these would just be aliases for bool, false, and true.
>> >
>> > And then modify the proposed PyThread_tss_create and PyThread_tss_delete
>> > APIs to accept a "Py_ensure_t *init_flag" in addition to their current
>> > arguments.
>>
>> That all sounds good--between the two option 2 looks a bit more explicit.
>>
>> Though what about this?  Rather than adding another type, the original
>> proposal could be changed slightly so that Py_tss_t *is* partially
>> defined as a struct consisting of a bool, with whatever the native TLS
>> key is.   E.g.
>>
>> typedef struct {
>> bool init_flag;
>> #if defined(_POSIX_THREADS)
>> pthread_key_t key;
>> #elif defined (NT_THREADS)
>> DWORD key;
>> /* etc... */
>> } Py_tss_t;
>>
>> Then it's just taking Masayuki's original patch, with the global bool
>> variables, and formalizing that by combining the initialized flag with
>> the key, and requiring the semantics you described above for
>> PyThread_tss_create/delete.
>>
>> For Python's purposes it seems like this might be good enough, with
>> the more general purpose pthread_once-like functionality not required.
>
>
> Aye, I also thought of that approach, but talked myself out of it since
> there's no definable default value for pthread_key_t. However, C99 partial
> initialisation may deal with that for us (by zeroing the memory without
> actually assigning a typed value to it), and if it does, I agree it would be
> better to handle the initialisation flag automatically rather than requiring
> callers to do it.

I think I understand what you're saying here...  To be clear, let me
enumerate the three currently supported cases and how they're
affected:

1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the
implementation--that the keys are integers starting from zero)
2) pthreads: Does not define an uninitialized default value for
keys, for reasons described at [1] under "Non-Idempotent Data Key
Creation".  I understand their reasoning, though I can't claim to know
specifically what they mean when they say that some implementations
would require the mutual-exclusion to be performed on
pthread_getspecific() as well.  I don't know that it applies here.
3) windows: The return value of TlsAlloc() is a DWORD (unsigned int)
and [2] states that its value should be opaque.

So in principle we can cover all cases with an opaque struct that
contains, as its first member, an is_initialized flag.  The tricky
part is how to initialize the rest of the struct (containing the
underlying implementation-specific key).  For 1) and 3) it doesn't
matter--it can just be zero.  For 2) it's trickier because there's no
defined constant value to initialize a pthread_key_t to.

Per Nick's suggestion this can be worked around by relying on C99's
initialization semantics. Per [3] section 6.7.8, clause 21:

"""
If there are fewer initializers in a brace-enclosed list than there
are elements or members of an aggregate, or fewer characters in a
string literal used to initialize an array of known size than there
are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage
duration.
"""

How objects with static storage are initialized is described in the
previous page under clause 10, but in practice it boils down to what
you would expect: Everything is initialized to zero, including nested
structs and arrays.

So as long as we can use this feature of C99 then I think that's the
best approach.



[1] 
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
[2] 
https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).aspx
[3] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-21 Thread Erik Bray
On Wed, Dec 21, 2016 at 11:01 AM, Erik Bray <erik.m.b...@gmail.com> wrote:
> That all sounds good--between the two option 2 looks a bit more explicit.
>
> Though what about this?  Rather than adding another type, the original
> proposal could be changed slightly so that Py_tss_t *is* partially
> defined as a struct consisting of a bool, with whatever the native TLS
> key is.   E.g.
>
> typedef struct {
> bool init_flag;
> #if defined(_POSIX_THREADS)
> pthreat_key_t key;

*pthread_key_t* of course, though I wonder if that was a Freudian slip :)

> #elif defined (NT_THREADS)
> DWORD key;
> /* etc... */
> } Py_tss_t;
>
> Then it's just taking Masayuki's original patch, with the global bool
> variables, and formalizing that by combining the initialized flag with
> the key, and requiring the semantics you described above for
> PyThread_tss_create/delete.
>
> For Python's purposes it seems like this might be good enough, with
> the more general purpose pthread_once-like functionality not required.

Of course, that's not to say it might not be useful for some other
purpose, but then it's outside the scope of this discussion as long as
it isn't needed for TLS key initialization.


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-21 Thread Erik Bray
On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 21 December 2016 at 01:35, Masayuki YAMAMOTO <ma3yuki.8mam...@gmail.com>
> wrote:
>>
>> 2016-12-20 22:30 GMT+09:00 Erik Bray <erik.m.b...@gmail.com>:
>>>
>>> This is probably an implementation detail, but ISTM that even with
>>> PyThread_call_once, it will be necessary to reset any used once_flags
>>> manually in PyOS_AfterFork, essentially for the same reason the
>>> autoTLSkey is reset there currently...
>>
>>
>> Deleting the thread keys is done in the *_Fini functions, but Py_FinalizeEx,
>> which calls the *_Fini functions, doesn't terminate the CPython interpreter.
>> Furthermore, source comments and the documentation describe
>> reinitialization after calling Py_FinalizeEx. [1] [2] That is to say, there
>> is an implicit possibility of reinitialization, contrary to the name
>> "call_once", at the process level. Therefore, if the CPython interpreter
>> continues to allow reinitialization, I'd suggest renaming the call_once API
>> to avoid misleading semantics. (for example, safe_init or check_init)
>
>
> Ouch, I'd missed that, and I agree it's not a negligible implementation
> detail - there are definitely applications embedding CPython out there that
> rely on being able to run multiple Initialize/Finalize cycles in the same
> process and have everything "just work". It also means using the
> "PyThread_*" prefix for the initialisation tracking aspect would be
> misleading, since the life cycle details are:
>
> 1. Create the key for the first time if it has never been previously set in
> the process
> 2. Destroy and reinit if Py_Finalize gets called
> 3. Destroy and reinit if a new subprocess is forked
>
> It also means we can't use pthread_once even in the pthread TLS
> implementation, since it doesn't provide those semantics.
>
> So I see two main alternatives here.
>
> Option 1: Modify the proposed PyThread_tss_create and PyThread_tss_delete
> APIs to accept a "bool *init_flag" pointer in addition to their current
> arguments.
>
> If *init_flag is true, then PyThread_tss_create is a no-op, otherwise it
> sets the flag to true after creating the key.
> If *init_flag is false, then PyThread_tss_delete is a no-op, otherwise it
> sets the flag to false after deleting the key.
>
> Option 2: Similar to option 1, but using a custom type alias, rather than
> using a C99 bool directly
>
> The closest API we have to these semantics at the moment would be
> PyGILState_Ensure, so the following API naming might work for option 2:
>
> Py_ensure_t
> Py_ENSURE_NEEDS_INIT
> Py_ENSURE_INITIALIZED
>
> Respectively, these would just be aliases for bool, false, and true.
>
> And then modify the proposed PyThread_tss_create and PyThread_tss_delete
> APIs to accept a "Py_ensure_t *init_flag" in addition to their current
> arguments.

That all sounds good--between the two option 2 looks a bit more explicit.

Though what about this?  Rather than adding another type, the original
proposal could be changed slightly so that Py_tss_t *is* partially
defined as a struct consisting of a bool, with whatever the native TLS
key is.   E.g.

typedef struct {
bool init_flag;
#if defined(_POSIX_THREADS)
pthreat_key_t key;
#elif defined (NT_THREADS)
DWORD key;
/* etc... */
} Py_tss_t;

Then it's just taking Masayuki's original patch, with the global bool
variables, and formalizing that by combining the initialized flag with
the key, and requiring the semantics you described above for
PyThread_tss_create/delete.

For Python's purposes it seems like this might be good enough, with
the more general purpose pthread_once-like functionality not required.

Best,
Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Mon, Dec 19, 2016 at 3:45 PM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Mon, Dec 19, 2016 at 1:11 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
>> On 17 December 2016 at 03:51, Antoine Pitrou <solip...@pitrou.net> wrote:
>>>
>>> On Fri, 16 Dec 2016 13:07:46 +0100
>>> Erik Bray <erik.m.b...@gmail.com> wrote:
>>> > Greetings all,
>>> >
>>> > I wanted to bring attention to an issue that's been languishing on the
>>> > bug tracker since last year, which I think would best be addressed by
>>> > changes to CPython's C-API.  The original issue is at
>>> > http://bugs.python.org/issue25658, but I have made an effort below in
>>> > a sort of proto-PEP to summarize the problem and the proposed
>>> > solution.
>>> >
>>> > I haven't written this up in the proper PEP format because I want to
>>> > see if the idea has some broader support first, and it's also not
>>> > clear to me whether C-API changes (especially to undocumented APIs)
>>> > even require their own PEP.
>>>
>>> This is a nice detailed write-up and I'm in favour of the proposal.
>>
>>
>> Likewise - we know the status quo isn't right, and the proposed change
>> addresses that. In reviewing the patch on the tracker, the one downside I've
>> found is that due to "pthread_key_t" being an opaque type with no defined
>> sentinel, the consuming code in _tracemalloc.c and pystate.c needed to add
>> separate boolean flag variables to track whether or not the key had been
>> created. (The pthread examples at
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
>> use pthread_once for a similar effect)
>>
>> I don't see any obvious way around that either, as even using a small struct
>> for native pthread TLS keys would still face the problem of how to
>> initialise the pthread_key_t field.
>
> Hmm...fair point that it's not pretty.  One way around it, albeit
> requiring more work/complexity, would be to extend this proposal to
> add a new function analogous to pthread_once--say--PyThread_call_once,
> and an associated Py_once_flag_t

Oops--fat-fingered a 'send' command before I finished.

So a workaround would be to add a PyThread_call_once function,
analogous to pthread_once.  Yet another interface one needs to
implement for a native thread implementation, but not too hard either.
For pthreads there's already an obvious analogue that can be wrapped
directly.  For other platforms that don't have a direct analogue a
(naive) implementation is still fairly simple: All you need in
Py_once_flag_t is a boolean flag with an associated mutex, and a
sentinel value analogous to PTHREAD_ONCE_INIT.

Best,
Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Mon, Dec 19, 2016 at 1:11 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 17 December 2016 at 03:51, Antoine Pitrou <solip...@pitrou.net> wrote:
>>
>> On Fri, 16 Dec 2016 13:07:46 +0100
>> Erik Bray <erik.m.b...@gmail.com> wrote:
>> > Greetings all,
>> >
>> > I wanted to bring attention to an issue that's been languishing on the
>> > bug tracker since last year, which I think would best be addressed by
>> > changes to CPython's C-API.  The original issue is at
>> > http://bugs.python.org/issue25658, but I have made an effort below in
>> > a sort of proto-PEP to summarize the problem and the proposed
>> > solution.
>> >
>> > I haven't written this up in the proper PEP format because I want to
>> > see if the idea has some broader support first, and it's also not
>> > clear to me whether C-API changes (especially to undocumented APIs)
>> > even require their own PEP.
>>
>> This is a nice detailed write-up and I'm in favour of the proposal.
>
>
> Likewise - we know the status quo isn't right, and the proposed change
> addresses that. In reviewing the patch on the tracker, the one downside I've
> found is that due to "pthread_key_t" being an opaque type with no defined
> sentinel, the consuming code in _tracemalloc.c and pystate.c needed to add
> separate boolean flag variables to track whether or not the key had been
> created. (The pthread examples at
> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
> use pthread_once for a similar effect)
>
> I don't see any obvious way around that either, as even using a small struct
> for native pthread TLS keys would still face the problem of how to
> initialise the pthread_key_t field.

Hmm...fair point that it's not pretty.  One way around it, albeit
requiring more work/complexity, would be to extend this proposal to
add a new function analogous to pthread_once--say--PyThread_call_once,
and an associated Py_once_flag_t


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Sat, Dec 17, 2016 at 8:21 AM, Stephen J. Turnbull
<turnbull.stephen...@u.tsukuba.ac.jp> wrote:
> Erik Bray writes:
>
>  > Abstract
>  > ========
>  >
>  > The proposal is to add a new Thread Local Storage (TLS) API to CPython
>  > which would supersede use of the existing TLS API within the CPython
>  > interpreter, while deprecating the existing API.
>
> Thank you for the analysis!

And thank *you* for the feedback!

> Question:
>
>  > Further, the old PyThread_*_key* functions will be marked as
>  > deprecated.
>
> Of course, but:
>
>  > Additionally, the pthread implementations of the old
>  > PyThread_*_key* functions will either fail or be no-ops on
>  > platforms where sizeof(pythead_t) != sizeof(int).
>
> Typo "pythead_t" in last line.

Thanks, yes, that was supposed to be pthread_key_t of course.  I think
I had a few other typos too.

> I don't understand this.  I assume that there are no such platforms
> supported at present.  I would think that when such a platform becomes
> supported, code supporting "key" functions becomes unsupportable
> without #ifdefs on that platform, at least directly.  So you should
> either (1) raise UnimplementedError, or (2) provide the API as a
> wrapper over the new API by making the integer keys indexes into a
> table of TSS'es, or some such device.  I don't understand how (3)
> "make it a no-op" can be implemented for PyThread_create_key -- return
> 0 or -1?  That would only work if there's a failure return status like
> 0 or -1, and it seems really dangerous to me since in general a lot of
> code doesn't check status even though it should.  Even for code
> checking the status, the error message will be suboptimal ("creation
> failed" vs. "unimplemented").

Masayuki already explained this downthread I think, but I could have
probably made that section more precise.  The point was that
PyThread_create_key should immediately return -1 in this case.  This
is just a subtle difference over the current situation, which is that
PyThread_create_key succeeds, but the key is corrupted by being cast
to an int, so that later calls to PyThread_set_key_value and the like
fail unexpectedly.  The point is that PyThread_create_key (and we're
only talking about the pthread implementation thereof, to be clear)
must fail immediately if it can't work correctly.

#ifdefs on the platform would not be necessary--instead, Masayuki's
patch adds a feature check in configure.ac for sizeof(int) ==
sizeof(pthread_key_t).  It should be noted that even this check is not
100% perfect, as on Linux pthread_key_t is an unsigned int, and so
technically can cause Python's signed int key to overflow, but there's
already an explicit check for that (which would be kept), and it's
also a very unlikely scenario.

> I gather from references to casting pthread_key_t to unsigned int and
> back that there's probably code that does this in ways making (2) too
> dangerous to support.  If true, perhaps that should be mentioned here.

It's not necessarily too dangerous, so much as not worth the trouble,
IMO.  Simpler to just provide, and immediately use the new API and
make the old one deprecated and explicitly not supported on those
platforms where it can't work.

Thanks,
Erik


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-19 Thread Erik Bray
On Sun, Dec 18, 2016 at 12:10 AM, Masayuki YAMAMOTO
 wrote:
> 2016-12-17 18:35 GMT+09:00 Stephen J. Turnbull
> :
>>
>> I don't understand this.  I assume that there are no such platforms
>> supported at present.  I would think that when such a platform becomes
>> supported, code supporting "key" functions becomes unsupportable
>> without #ifdefs on that platform, at least directly.  So you should
>> either (1) raise UnimplementedError, or (2) provide the API as a
>> wrapper over the new API by making the integer keys indexes into a
>> table of TSS'es, or some such device.  I don't understand how (3)
>> "make it a no-op" can be implemented for PyThread_create_key -- return
>> 0 or -1?  That would only work if there's a failure return status like
>> 0 or -1, and it seems really dangerous to me since in general a lot of
>> code doesn't check status even though it should.  Even for code
>> checking the status, the error message will be suboptimal ("creation
>> failed" vs. "unimplemented").
>
>
> PyThread_create_key has always required the user to check the return value,
> since it returns -1 instead of a valid key when key creation fails.
> Therefore, my patch changes PyThread_create_key to always return -1 on
> platforms where the key cannot safely be cast to int, so the current API
> never returns a valid key value on those platforms. The advantage is that
> the function specifications are unchanged and there is no effect on
> supported platforms. Hence, this is the reason the API doesn't raise
> any exception.
>
> Idea (2) could keep the current API working on those specific platforms. If
> it were simple, I'd have liked to select it.  However, implementing the
> current API on top of native TLS on those platforms would duplicate the
> key-management implementation, and it's ugly (for the same reason as the
> last item of Rejected Ideas in Erik's draft).  Thus, I gave up on keeping
> the feature and decided to implement the "no-op" behaviour, delegating
> error handling to API users.

Yep--I think it speaks to the sensibleness of that decision that I
pretty much read your mind :)


[Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-16 Thread Erik Bray
Greetings all,

I wanted to bring attention to an issue that's been languishing on the
bug tracker since last year, which I think would best be addressed by
changes to CPython's C-API.  The original issue is at
http://bugs.python.org/issue25658, but I have made an effort below in
a sort of proto-PEP to summarize the problem and the proposed
solution.

I haven't written this up in the proper PEP format because I want to
see if the idea has some broader support first, and it's also not
clear to me whether C-API changes (especially to undocumented APIs)
even require their own PEP.


Abstract
========

The proposal is to add a new Thread Local Storage (TLS) API to CPython
which would supersede use of the existing TLS API within the CPython
interpreter, while deprecating the existing API.

Because the existing TLS API is only used internally (it is not
mentioned in the documentation, and the header that defines it,
pythread.h, is not included in Python.h either directly or
indirectly), this proposal probably only affects CPython, but might
also affect other interpreter implementations (PyPy?) that implement
parts of the CPython API.


Specification
=============

The current API for TLS used inside the CPython interpreter consists
of 5 functions:

PyAPI_FUNC(int) PyThread_create_key(void)
PyAPI_FUNC(void) PyThread_delete_key(int key)
PyAPI_FUNC(int) PyThread_set_key_value(int key, void *value)
PyAPI_FUNC(void *) PyThread_get_key_value(int key)
PyAPI_FUNC(void) PyThread_delete_key_value(int key)

These would be superseded with a new set of analogous functions:

PyAPI_FUNC(int) PyThread_tss_create(Py_tss_t *key)
PyAPI_FUNC(void) PyThread_tss_delete(Py_tss_t key)
PyAPI_FUNC(int) PyThread_tss_set(Py_tss_t key, void *value)
PyAPI_FUNC(void *) PyThread_tss_get(Py_tss_t key)
PyAPI_FUNC(void) PyThread_tss_delete_value(Py_tss_t key)

and includes the definition of a new type Py_tss_t--an opaque type
the specification of which is not given here, and may depend on the
underlying TLS implementation.

The new PyThread_tss_ functions are almost exactly analogous to their
original counterparts with a minor difference:  Whereas
PyThread_create_key takes no arguments and returns a TLS key as an
int, PyThread_tss_create takes a Py_tss_t* as an argument, and returns
a Py_tss_t by pointer--the int return value is a status, returning
zero on success and non-zero on failure.

Further, the old PyThread_*_key* functions will be marked as
deprecated.  Additionally, the pthread implementations of the old
PyThread_*_key* functions will either fail or be no-ops on platforms
where sizeof(pthread_key_t) != sizeof(int).


Motivation
==========

The primary problem at issue here is the type of the keys (int) used
for TLS values, as defined by the original PyThread TLS API.

The original TLS API was added to Python by GvR back in 1997, and at
the time the key used to represent a TLS value was an int, and so it
has been to this day.  This used CPython's own TLS implementation, the
current generation of which can still be found, largely unchanged, in
Python/thread.c.  Support for implementation of the API on top of
native thread implementations (NT and pthreads) was added much later,
and the built-in implementation may still be used on other platforms.

The problem with the choice of int to represent a TLS key, is that
while it was fine for CPython's internal TLS implementation, and
happens to be fine for NT (which uses DWORD), it is not compatible with
the POSIX standard for the pthreads API, which defines pthread_key_t as an
opaque type not further specified by the standard (as with Py_tss_t
described above).  This leaves it up to the underlying implementation
how a pthread_key_t value is used to look up thread-specific data.

This has not generally been a problem for Python's API, as it just
happens that on Linux pthread_key_t is just defined as an unsigned
int, and so is fully compatible with Python's TLS API--pthread_key_t's
created by pthread_create_key can be freely cast to ints and back
(well, not really, even this has issues as pointed out by issue
#22206).

However, as issue #25658 points out there are at least some platforms
(namely Cygwin, CloudABI, but likely others as well) which have
otherwise modern and POSIX-compliant pthreads implementations, but are
not compatible with Python's API because their pthread_key_t is
defined in a way that cannot be safely cast to int.  In fact, the
possibility of running into this problem was raised by MvL at the time
pthreads TLS was added [1].

It could be argued that PEP-11 makes specific requirements for
supporting a new, not otherwise officially-supported platform (such as
CloudABI), and that the status of Cygwin support is currently dubious.
However, this places a very high barrier to supporting platforms that are
otherwise Linux- and/or POSIX-compatible and where CPython might
otherwise "just work" except for this one hurdle which Python itself
imposes by way of an API that is not compatible with POSIX (and 

Re: [Python-ideas] if-statement in for-loop

2016-09-27 Thread Erik Bray
On Tue, Sep 27, 2016 at 5:33 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 28 September 2016 at 00:55, Erik Bray <erik.m.b...@gmail.com> wrote:
>> On Sun, Sep 11, 2016 at 12:28 PM, Bernardo Sulzbach
>> <mafagafogiga...@gmail.com> wrote:
>>> On 09/11/2016 06:36 AM, Dominik Gresch wrote:
>>>>
>>>> So I asked myself if a syntax as follows would be possible:
>>>>
>>>> for i in range(10) if i != 5:
>>>> body
>>>>
>>>> Personally, I find this extremely intuitive since this kind of
>>>> if-statement is already present in list comprehensions.
>>>>
>>>> What is your opinion on this? Sorry if this has been discussed before --
>>>> I didn't find anything in the archives.
>>>>
>>>
>>> I find it interesting.
>>>
>>> I thing that this will likely take up too many columns in more convoluted
>>> loops such as
>>>
>>> for element in collection if is_pretty_enough(element) and ...:
>>> ...
>>>
>>> However, this "problem" is already faced by list comprehensions, so it is
>>> not a strong argument against your idea.
>>
>> Sorry to re-raise this thread--I'm inclined to agree that the case
>> doesn't really warrant new syntax.  I just wanted to add that I think
>> the very fact that this syntax is supported by list comprehensions is
>> an argument *in its favor*.
>>
>> I could easily see a Python newbie being confused that they can write
>> "for x in y if z" inside a list comprehension, but not in a bare
>> for-statement.  Sure they'd learn quickly enough that the filtering
>> syntax is unique to list comprehensions.  But to anyone who doesn't
>> know the historical progression of the Python language that would seem
>> highly arbitrary and incongruous I would think.
>>
>> Just $0.02 USD from a pedagogical perspective.
>
> This has come up before, and it's considered a teaching moment
> regarding how the comprehension syntax actually works: it's an
> *arbitrarily deep* nested chain of if statements and for statements.
>
> That is:
>
>   [f(x,y,z) for x in seq1 if p1(x) for y in seq2 if p2(y) for z in
> seq3 if p3(z)]
>
> can be translated mechanically to the equivalent nested statements
> (with the only difference being that the loop variables leak due to the
> missing implicit scope):
>
> result = []
> for x in seq1:
> if p1(x):
> for y in seq2:
> if p2(y):
> for z in seq3:
> if p3(z):
> result.append(f(x, y, z))
>
> So while the *most common* cases are a single for loop (map
> equivalent), or a single for loop and a single if statement (filter
> equivalent), they're not only the forms folks may encounter in the
> wild.

Thanks for pointing this out Nick.  Then following my own logic it
would be desirable to also allow the nested for loop syntax of list
comprehensions outside them as well.  That's a slippery slope to
incomprehensibility (they're bad enough in list comprehensions, though
occasionally useful).

This is a helpful way to think about list comprehensions though--I'll
remember it next time I teach them.
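Nick's mechanical translation can be checked directly on a concrete case (the names seq1/p1/f here are just placeholders standing in for his example):

```python
# Concrete check of the translation: three clauses, each filtered.
seq1 = seq2 = seq3 = range(3)
p1 = p2 = p3 = lambda n: n != 1
f = lambda x, y, z: (x, y, z)

comp = [f(x, y, z)
        for x in seq1 if p1(x)
        for y in seq2 if p2(y)
        for z in seq3 if p3(z)]

# The mechanically-translated nested-statement form:
result = []
for x in seq1:
    if p1(x):
        for y in seq2:
            if p2(y):
                for z in seq3:
                    if p3(z):
                        result.append(f(x, y, z))

assert comp == result
assert len(comp) == 8  # x, y, z each drawn from {0, 2}
```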

Thanks,
Erik


Re: [Python-ideas] if-statement in for-loop

2016-09-27 Thread Erik Bray
On Sun, Sep 11, 2016 at 12:28 PM, Bernardo Sulzbach
 wrote:
> On 09/11/2016 06:36 AM, Dominik Gresch wrote:
>>
>> So I asked myself if a syntax as follows would be possible:
>>
>> for i in range(10) if i != 5:
>> body
>>
>> Personally, I find this extremely intuitive since this kind of
>> if-statement is already present in list comprehensions.
>>
>> What is your opinion on this? Sorry if this has been discussed before --
>> I didn't find anything in the archives.
>>
>
> I find it interesting.
>
> I thing that this will likely take up too many columns in more convoluted
> loops such as
>
> for element in collection if is_pretty_enough(element) and ...:
> ...
>
> However, this "problem" is already faced by list comprehensions, so it is
> not a strong argument against your idea.

Sorry to re-raise this thread--I'm inclined to agree that the case
doesn't really warrant new syntax.  I just wanted to add that I think
the very fact that this syntax is supported by list comprehensions is
an argument *in its favor*.

I could easily see a Python newbie being confused that they can write
"for x in y if z" inside a list comprehension, but not in a bare
for-statement.  Sure they'd learn quickly enough that the filtering
syntax is unique to list comprehensions.  But to anyone who doesn't
know the historical progression of the Python language that would seem
highly arbitrary and incongruous I would think.

Just $0.02 USD from a pedagogical perspective.
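For completeness, a sketch of what the proposed syntax would sugar over today, with a guard clause on one side and the comprehension filter on the other:

```python
wanted = [i for i in range(10) if i != 5]  # the comprehension filter

# The statement form the proposal "for i in range(10) if i != 5:"
# would replace:
result = []
for i in range(10):
    if i == 5:
        continue
    result.append(i)

assert result == wanted == [0, 1, 2, 3, 4, 6, 7, 8, 9]
```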

Erik


Re: [Python-ideas] real numbers with SI scale factors

2016-08-31 Thread Erik Bray
On Tue, Aug 30, 2016 at 5:48 AM, Ken Kundert
<python-id...@shalmirane.com> wrote:
> Erik,
> One aspect of astropy.units that differs significantly from what I am
> proposing is that with astropy.units a user would explicitly specify the scale
> factor along with the units, and that scale factor would not change even
> if the value became very large or very small. For example:
>
> >>> from astropy import units as u
> >>> d_andromeda = 7.8e5 * u.parsec
> >>> print(d_andromeda)
> 780000.0 pc
>
> >>> d_sun = 93e6*u.imperial.mile
> >>> print(d_sun.to(u.parsec))
> 4.850441695494146e-06 pc
>
> >>> print(d_andromeda.to(u.kpc))
> 780.0 kpc
>
> >>> print(d_sun.to(u.kpc))
> 4.850441695494146e-09 kpc
>
> I can see where this can be helpful at times, but it kind of goes against the
> spirit of SI scale factors, were you are generally expected to 'normalize' the
> scale factor (use the scale factor that results in the digits presented before
> the decimal point falling between 1 and 999). So I would expected
>
> d_andromeda = 780 kpc
> d_sun = 4.8504 upc
>
> Is the normalization available in astropy.units and I just did not find it?
> Is there some reason not to provide the normalization?
>
> It seems to me that pre-specifying the scale factor might be preferred if
> one is generating data for a table and the magnitudes of the values are
> known in advance to within 2-3 orders of magnitude.
>
> It also seems to me that if these assumptions were not true, then normalizing
> the scale factors would generally be preferred.
>
> Do you believe that?

Hi Ken,

I see what you're getting at, and that's a good idea.  There's also
nothing in the current implementation preventing it, and I think I'll
even suggest this to Astropy (with proper attribution)!  I think there
are reasons not to always do this, but it's a nice option to have.

Point being nothing about this particular feature requires special
support from the language, unless I'm missing something obvious.  And
given that Astropy (or any other units library) is third-party chances
are a feature like this will land in place a lot faster than it has
any chance of showing up in Python :)
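The "normalization" Ken describes (choose the prefix that puts the mantissa in [1, 1000)) needs no language support at all; a minimal pure-Python sketch, with invented names and not part of astropy.units, might look like:

```python
import math

# SI prefixes keyed by power-of-ten exponent (a subset, for illustration).
_PREFIXES = {-18: 'a', -15: 'f', -12: 'p', -9: 'n', -6: 'u', -3: 'm',
             0: '', 3: 'k', 6: 'M', 9: 'G', 12: 'T', 15: 'P', 18: 'E'}

def si_format(value, unit=''):
    """Format value with the SI prefix that normalizes the mantissa."""
    if value == 0:
        return f'0 {unit}'
    exp = math.floor(math.log10(abs(value)) / 3) * 3
    exp = max(-18, min(18, exp))  # clamp to the prefixes we know
    return f'{value / 10**exp:g} {_PREFIXES[exp]}{unit}'

print(si_format(7.8e5, 'pc'))      # 780 kpc
print(si_format(4.8504e-6, 'pc'))  # 4.8504 upc
```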

Best,
Erik

> On Mon, Aug 29, 2016 at 03:05:50PM +0200, Erik Bray wrote:
>> Astropy also has a very powerful units package--originally derived
>> from pyunit I think but long since diverged and grown:
>>
>> http://docs.astropy.org/en/stable/units/index.html
>>
>> It was originally developed especially for astronomy/astrophysics use
>> and has some pre-defined units that many other packages don't have, as
>> well as support for logarithmic units like decibel and optional (and
>> customizeable) unit equivalences (e.g. frequency/wavelength or
>> flux/power).
>>
>> That said, its power extends beyond astronomy and I heard through last
>> week's EuroScipy that even some biology people have been using it.
>> There's been some (informal) talk about splitting it out from Astropy
>> into a stand-alone package.  This is tricky since almost everything in
>> Astropy has been built around it (dimensional calculations are always
>> used where possible), but not impossible.
>>
>> One of the other big advantages of astropy.units is the Quantity class
>> representing scale+dimension values.  This is deeply integrated into
>> Numpy so that units can be attached to Numpy arrays, and all Numpy
>> ufuncs can operate on them in a dimensionally meaningful way.  The
>> needs for this have driven a number of recent features in Numpy.  This
>> is work that, unfortunately, could never be integrated into the Python
>> stdlib.


Re: [Python-ideas] A proposal to rename the term "duck typing"

2016-08-29 Thread Erik Bray
On Sun, Aug 28, 2016 at 7:41 PM, Bruce Leban  wrote:
>
>
> On Sunday, August 28, 2016, ROGER GRAYDON CHRISTMAN  wrote:
>>
>>
>> We have a term in our lexicon "duck typing" that traces its origins, in
>> part to a quote along the lines of
>> "If it walks like a duck, and talks like a duck, ..."
>>
>> ...
>>
>> In that case, it would be far more appropriate for use to call this sort
>> of type analysis "witch typing"
>
>
> I believe the duck is out of the bag on this one. First the "duck test" that
> you quote above is over 100 years old.
> https://en.m.wikipedia.org/wiki/Duck_test So that's entrenched.
>
> Second this isn't a Python-only term anymore and language is notoriously
> hard to change prescriptively.
>
> Third I think the duck test is more appropriate than the witch test which
> involves the testers faking the results.

Agreed.

It's also fairly problematic given that you're deriving the term from
a sketch about witch hunts.  While the Monty Python sketch is
hilarious, and it's the ignorant mob that's the butt of the joke
rather than the "witch", the joke doesn't necessarily play well
universally, especially given that there are places today where women
are being killed for being "witches".

Best,
Erik


Re: [Python-ideas] real numbers with SI scale factors

2016-08-29 Thread Erik Bray
On Mon, Aug 29, 2016 at 3:05 PM, Erik Bray <erik.m.b...@gmail.com> wrote:
> On Mon, Aug 29, 2016 at 9:07 AM, Ken Kundert
> <python-id...@shalmirane.com> wrote:
>> On Mon, Aug 29, 2016 at 01:45:20PM +1000, Steven D'Aprano wrote:
>>> On Sun, Aug 28, 2016 at 08:26:38PM -0700, Brendan Barnwell wrote:
>>> > On 2016-08-28 18:44, Ken Kundert wrote:
>>> > >When working with a general purpose programming language, the above 
>>> > >numbers
>>> > >become:
>>> > >
>>> > > 780kpc -> 7.8e+05
>>> [...]
>>>
>>> For the record, I don't know what kpc might mean. "kilo pico speed of
>>> light"? So I looked it up using units, and it is kilo-parsecs. That
>>> demonstrates that unless your audience is intimately familiar with the
>>> domain you are working with, adding units (especially units that aren't
>>> actually used for anything) adds confusion.
>>>
>>> Python is not a specialist application targetted at a single domain. It
>>> is a general purpose programming language where you can expect a lot of
>>> cross-domain people (e.g. a system administrator asked to hack on a
>>> script in a domain they know nothing about).
>>
>> I talked to astrophysicist about your comments, and what she said was:
>> 1. She would love it if Python had built in support for real numbers with SI
>>scale factors
>> 2. I told her about my library for reading and writing numbers with SI scale
>>factors, and she was much less enthusiastic because using it would require
>>convincing the rest of the group, which would be too much effort.
> 3. She was amused by the "kilo pico speed of light" comment, but she was
>adamant that the fact that you, or some system administrator, does not
>understand what kpc means has absolutely no effect on her desire to use
>SI scale factors. Her comment: I did not write it for him.
> 4. She pointed out that the software she writes and uses is intended
>either for herself or other astrophysicists. No system administrators
>involved.
>
> Astropy also has a very powerful units package--originally derived
> from pyunit I think but long since diverged and grown:
>
> http://docs.astropy.org/en/stable/units/index.html
>
> It was originally developed especially for astronomy/astrophysics use
> and has some pre-defined units that many other packages don't have, as
> well as support for logarithmic units like decibel and optional (and
> customizeable) unit equivalences (e.g. frequency/wavelength or
> flux/power).
>
> That said, its power extends beyond astronomy and I heard through last
> week's EuroScipy that even some biology people have been using it.
> There's been some (informal) talk about splitting it out from Astropy
> into a stand-alone package.  This is tricky since almost everything in
> Astropy has been built around it (dimensional calculations are always
> used where possible), but not impossible.
>
> One of the other big advantages of astropy.units is the Quantity class
> representing scale+dimension values.  This is deeply integrated into
> Numpy so that units can be attached to Numpy arrays, and all Numpy
> ufuncs can operate on them in a dimensionally meaningful way.  The
> needs for this have driven a number of recent features in Numpy.  This
> is work that, unfortunately, could never be integrated into the Python
> stdlib.

I'll also add that syntactic support for units has rarely been an
issue in Astropy.  The existing algebraic rules for units work fine
with Python's existing order of operations.  It can be *nice* to be
able to write "1m" instead of "1 * m" but ultimately it doesn't add
much for clarity (and if really desired it could be handled with a
preparser--something I've considered adding for Astropy sources via
codecs).
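A toy version of the preparser idea (purely hypothetical; no such feature exists in Astropy) could rewrite NUMBER+UNIT tokens into ordinary multiplications before the source reaches the interpreter:

```python
import re

# Rewrite NUMBER+UNIT tokens like "780kpc" into "(780 * kpc)".  The
# regex deliberately fails to match exponent forms like "7.8e5", since
# the trailing digit prevents a word boundary after the letters.
_UNIT_RE = re.compile(r'\b(\d+(?:\.\d+)?)([a-zA-Z]+)\b')

def preparse(source):
    return _UNIT_RE.sub(r'(\1 * \2)', source)

print(preparse("d = 780kpc + 0.5kpc"))
# d = (780 * kpc) + (0.5 * kpc)
```

A real codec-based preparser would hook into Python's source-decoding machinery rather than a bare regex, but the transformation itself is this simple.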

Best,
Erik


Re: [Python-ideas] real numbers with SI scale factors

2016-08-29 Thread Erik Bray
On Mon, Aug 29, 2016 at 9:07 AM, Ken Kundert
 wrote:
> On Mon, Aug 29, 2016 at 01:45:20PM +1000, Steven D'Aprano wrote:
>> On Sun, Aug 28, 2016 at 08:26:38PM -0700, Brendan Barnwell wrote:
>> > On 2016-08-28 18:44, Ken Kundert wrote:
>> > >When working with a general purpose programming language, the above 
>> > >numbers
>> > >become:
>> > >
>> > > 780kpc -> 7.8e+05
>> [...]
>>
>> For the record, I don't know what kpc might mean. "kilo pico speed of
>> light"? So I looked it up using units, and it is kilo-parsecs. That
>> demonstrates that unless your audience is intimately familiar with the
>> domain you are working with, adding units (especially units that aren't
>> actually used for anything) adds confusion.
>>
>> Python is not a specialist application targetted at a single domain. It
>> is a general purpose programming language where you can expect a lot of
>> cross-domain people (e.g. a system administrator asked to hack on a
>> script in a domain they know nothing about).
>
> I talked to astrophysicist about your comments, and what she said was:
> 1. She would love it if Python had built in support for real numbers with SI
>scale factors
> 2. I told her about my library for reading and writing numbers with SI scale
>factors, and she was much less enthusiastic because using it would require
>convincing the rest of the group, which would be too much effort.
> 3. She was amused by the "kilo pico speed of light" comment, but she was
>adamant that the fact that you, or some system administrator, does not
>understand what kpc means has absolutely no effect on her desire to use
>SI scale factors. Her comment: I did not write it for him.
> 4. She pointed out that the software she writes and uses is intended
>either for herself or other astrophysicists. No system administrators
>involved.

Astropy also has a very powerful units package--originally derived
from pyunit I think but long since diverged and grown:

http://docs.astropy.org/en/stable/units/index.html

It was originally developed especially for astronomy/astrophysics use
and has some pre-defined units that many other packages don't have, as
well as support for logarithmic units like decibel and optional (and
customizeable) unit equivalences (e.g. frequency/wavelength or
flux/power).

That said, its power extends beyond astronomy and I heard through last
week's EuroScipy that even some biology people have been using it.
There's been some (informal) talk about splitting it out from Astropy
into a stand-alone package.  This is tricky since almost everything in
Astropy has been built around it (dimensional calculations are always
used where possible), but not impossible.

One of the other big advantages of astropy.units is the Quantity class
representing scale+dimension values.  This is deeply integrated into
Numpy so that units can be attached to Numpy arrays, and all Numpy
ufuncs can operate on them in a dimensionally meaningful way.  The
needs for this have driven a number of recent features in Numpy.  This
is work that, unfortunately, could never be integrated into the Python
stdlib.
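The value-plus-unit idea behind Quantity can be sketched in a few lines of plain Python. This is a toy for illustration only; Astropy's actual Quantity subclasses numpy.ndarray and is far richer:

```python
class Quantity:
    """Toy value+unit pair; a unit is a dict of dimension -> exponent."""
    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        # addition is only meaningful between matching dimensions
        if self.unit != other.unit:
            raise ValueError('incompatible units')
        return Quantity(self.value + other.value, self.unit)

    def __mul__(self, other):
        # multiplication adds the exponents dimension by dimension
        unit = dict(self.unit)
        for dim, power in other.unit.items():
            unit[dim] = unit.get(dim, 0) + power
        return Quantity(self.value * other.value, unit)

d = Quantity(10.0, {'m': 1})
t = Quantity(2.0, {'s': 1})

area = d * d
assert area.value == 100.0 and area.unit == {'m': 2}

try:
    d + t  # metres plus seconds is a dimensional error
except ValueError:
    pass
else:
    raise AssertionError('expected ValueError')
```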