Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Brendan Barnwell

On 2017-10-17 07:26, Serhiy Storchaka wrote:

> 17.10.17 17:06, Nick Coghlan wrote:
>
>> Keep in mind we're not talking about a regular loop you can break out of
>> with Ctrl-C here - we're talking about a tight loop inside the
>> interpreter internals that leads to having to kill the whole host
>> process just to get out of it.
>
> And this is the root of the issue. Just let more tight loops be
> interruptible with Ctrl-C, and this will fix the more general issue.


	I was just thinking the same thing.  I think in general it's always bad 
for code to be uninterruptible with Ctrl-C.  If these infinite iterators 
were fixed so they could be interrupted, this containment problem would 
be much less painful.


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Koos Zevenhoven
On Tue, Oct 17, 2017 at 5:26 PM, Serhiy Storchaka 
wrote:

> 17.10.17 17:06, Nick Coghlan wrote:
>
>> Keep in mind we're not talking about a regular loop you can break out of
>> with Ctrl-C here - we're talking about a tight loop inside the interpreter
>> internals that leads to having to kill the whole host process just to get
>> out of it.
>>
>
> And this is the root of the issue. Just let more tight loops be
> interruptible with Ctrl-C, and this will fix the more general issue.
>
>
Not being able to interrupt something with Ctrl-C in the REPL or with the
interrupt command in Jupyter notebooks is definitely a thing I sometimes
encounter. A pity I don't remember when it happens, because I usually
forget it very soon after I've restarted the kernel and continued working.
But my guess is it's usually not because of an infinite iterator.

Regarding what the OP might have been after, and just for some wild
brainstorming based on true stories: In some sense, x in y should always
have an answer, even if it may be expensive to compute. Currently, it's
possible to implement "lazy truth values" which compute the bool value
lazily when .__bool__() is called. Until you call bool(..) on it, it would
just be Maybe, and then after the call, you'd actually have True or False.
In many cases it can even be enough to know if something is Maybe true.
Also, if you do something like any(truth_values), then you could skip the
Maybe ones on the first pass, because if you find one that's plain True,
you already have the answer.

Regarding `x in y`, where y is an infinite iterable without well defined
contents, that would return an instance of MaybeType, but .__bool__() would
raise an exception.
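A minimal sketch of the lazy truth value idea above, for illustration only. `MaybeType` here is a hypothetical class and `lazy_any` a hypothetical helper; neither exists in Python, and the choice of TypeError for the undecidable case is an assumption.

```python
class MaybeType:
    """Hypothetical lazy truth value; resolved only when bool() is called."""

    def __init__(self, compute=None):
        # compute: zero-argument callable returning the real answer,
        # or None when the answer cannot be computed at all.
        self._compute = compute
        self._value = None

    @property
    def resolved(self):
        return self._value is not None

    def __bool__(self):
        if self._value is None:
            if self._compute is None:
                # e.g. `x in y` for an infinite iterable without
                # well defined contents
                raise TypeError("truth value cannot be determined")
            self._value = bool(self._compute())
        return self._value


def lazy_any(values):
    """Like any(), but defers unresolved MaybeType values to a second pass."""
    pending = []
    for value in values:
        if isinstance(value, MaybeType) and not value.resolved:
            pending.append(value)  # skip the Maybe ones on the first pass
        elif value:
            return True  # a plain True already gives the answer
    return any(bool(value) for value in pending)
```

With this sketch, lazy_any([MaybeType(expensive), True]) returns True without ever calling expensive().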

––Koos


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Serhiy Storchaka

17.10.17 17:06, Nick Coghlan wrote:
> Keep in mind we're not talking about a regular loop you can break out of
> with Ctrl-C here - we're talking about a tight loop inside the
> interpreter internals that leads to having to kill the whole host
> process just to get out of it.


And this is the root of the issue. Just let more tight loops be 
interruptible with Ctrl-C, and this will fix the more general issue.




Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Nick Coghlan
On 17 October 2017 at 23:17, Koos Zevenhoven  wrote:

> On Tue, Oct 17, 2017 at 2:46 PM, Serhiy Storchaka 
> wrote:
>
>> 17.10.17 14:10, Nick Coghlan wrote:
>>
>>> 1. It's pretty easy to write "for x in y in y" when you really meant to
>>> write "for x in y", and if "y" is an infinite iterator, the "y in y" part
>>> will become an unbreakable infinite loop when executed instead of the
>>> breakable one you intended (especially annoying if it means you have to
>>> discard and restart a REPL session due to it, and that's exactly where that
>>> kind of typo is going to be easiest to make)
>>>
>>
>> I think it is better to leave this to linters.
>
>
> Just to note that there is currently nothing that would prevent making
> `for x in y in z` a syntax error. There is nothing meaningful that it
> could do, really, because y in z can only return True or False (or raise an
> Exception or loop infinitely).
>

That was just an example of one of the ways we can accidentally end up
writing "x in y" at the REPL, where "y" is an infinite iterator, since it's
the kind that's specific to "x in y", whereas other forms (like
accidentally using the wrong variable name) also apply to other iterator
consuming APIs (like the ones Serhiy mentioned).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Nick Coghlan
On 17 October 2017 at 21:46, Serhiy Storchaka  wrote:

> 17.10.17 14:10, Nick Coghlan wrote:
>
>> 1. It's pretty easy to write "for x in y in y" when you really meant to
>> write "for x in y", and if "y" is an infinite iterator, the "y in y" part
>> will become an unbreakable infinite loop when executed instead of the
>> breakable one you intended (especially annoying if it means you have to
>> discard and restart a REPL session due to it, and that's exactly where that
>> kind of typo is going to be easiest to make)
>>
>
> I think it is better to leave this to linters. I never encountered this
> mistake and doubt it is common. In any case the first execution of this
> code will expose the mistake.
>

People don't run linters at the REPL, and it's at the REPL where
accidentally getting an unbreakable infinite loop is most annoying.

Keep in mind we're not talking about a regular loop you can break out of
with Ctrl-C here - we're talking about a tight loop inside the interpreter
internals that leads to having to kill the whole host process just to get
out of it.


> 2. Containment testing already has a dedicated protocol so containers can
>> implement optimised containment tests, which means it's also trivial for an
>> infinite iterator to intercept and explicitly disallow containment checks
>> if it chooses to do so
>>
>
> But this has non-zero maintenance cost. As the one who made many changes
> in itertools.c I don't like the idea of increasing its complexity for
> optimizing a pretty rare case.
>

It's not an optimisation, it's a UX improvement for the interactive prompt.
The maintenance burden should be low, as it's highly unlikely we'd ever
need to change this behaviour again in the future (I do think deprecating
the success case would be more trouble than it would be worth though).


> And note that the comparison can have side effects. You can implement the
> optimization of `x in count()` only for the limited set of builtin types.
> For example `x in range()` is optimized only for exact int and bool. You
> can't guarantee finite time for cycle() and repeat() either since they
> can emit values of arbitrary types, with arbitrary __eq__.


We're not trying to guarantee finite execution time in general, we're just
making it more likely that either Ctrl-C works, or else you don't get stuck
in an infinite loop in the first place.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Koos Zevenhoven
On Tue, Oct 17, 2017 at 2:46 PM, Serhiy Storchaka 
wrote:

> 17.10.17 14:10, Nick Coghlan wrote:
>
>> 1. It's pretty easy to write "for x in y in y" when you really meant to
>> write "for x in y", and if "y" is an infinite iterator, the "y in y" part
>> will become an unbreakable infinite loop when executed instead of the
>> breakable one you intended (especially annoying if it means you have to
>> discard and restart a REPL session due to it, and that's exactly where that
>> kind of typo is going to be easiest to make)
>>
>
> I think it is better to leave this to linters.


Just to note that there is currently nothing that would prevent making
`for x in y in z` a syntax error. There is nothing meaningful that it
could do, really, because y in z can only return True or False (or raise an
Exception or loop infinitely).

But for an infinite iterable, the right answer may be Maybe ;)

––Koos
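For illustration, the parse discussed above can be checked with the standard ast module. This is only a sketch; `run_typo` is a hypothetical helper that reproduces the typo with a harmless finite list instead of an infinite iterator.

```python
import ast

tree = ast.parse("for x in y in z:\n    pass")
loop = tree.body[0]

# The loop target is just `x`; the iterable is the comparison `y in z`.
target_is_name = isinstance(loop.target, ast.Name)
iter_is_compare = isinstance(loop.iter, ast.Compare)
iter_op_is_in = isinstance(loop.iter.ops[0], ast.In)


def run_typo(y):
    """Run the mistyped loop; `y in y` is evaluated before iteration starts."""
    for x in y in y:
        pass
```

With a finite list, run_typo([1, 2]) fails with TypeError, because the loop tries to iterate over the bool returned by the containment test. With an infinite iterator, the containment test itself is what never returns.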



> I never encountered this mistake and doubt it is common. In any case the
> first execution of this code will expose the mistake.
>
> 2. Containment testing already has a dedicated protocol so containers can
>> implement optimised containment tests, which means it's also trivial for an
>> infinite iterator to intercept and explicitly disallow containment checks
>> if it chooses to do so
>>
>
> But this has non-zero maintenance cost. As the one who made many changes
> in itertools.c I don't like the idea of increasing its complexity for
> optimizing a pretty rare case.
>
> And note that the comparison can have side effects. You can implement the
> optimization of `x in count()` only for the limited set of builtin types.
> For example `x in range()` is optimized only for exact int and bool. You
> can't guarantee finite time for cycle() and repeat() either since they
> can emit values of arbitrary types, with arbitrary __eq__.
>
>



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Serhiy Storchaka

17.10.17 14:10, Nick Coghlan wrote:
> 1. It's pretty easy to write "for x in y in y" when you really meant to
> write "for x in y", and if "y" is an infinite iterator, the "y in y"
> part will become an unbreakable infinite loop when executed instead of
> the breakable one you intended (especially annoying if it means you have
> to discard and restart a REPL session due to it, and that's exactly
> where that kind of typo is going to be easiest to make)


I think it is better to leave this to linters. I never encountered this
mistake and doubt it is common. In any case the first execution of this
code will expose the mistake.


> 2. Containment testing already has a dedicated protocol so containers
> can implement optimised containment tests, which means it's also trivial
> for an infinite iterator to intercept and explicitly disallow
> containment checks if it chooses to do so


But this has non-zero maintenance cost. As the one who made many changes
in itertools.c I don't like the idea of increasing its complexity for
optimizing a pretty rare case.


And note that the comparison can have side effects. You can implement the
optimization of `x in count()` only for the limited set of builtin
types. For example `x in range()` is optimized only for exact int and
bool. You can't guarantee finite time for cycle() and repeat()
either since they can emit values of arbitrary types, with arbitrary __eq__.
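To illustrate the point about arbitrary __eq__, here is a small sketch. `Noisy` is a hypothetical class invented for this example. A finite repeat() is used so that the check terminates.

```python
import itertools


class Noisy:
    """Hypothetical type whose __eq__ has a visible side effect."""

    calls = 0

    def __eq__(self, other):
        Noisy.calls += 1  # side effect on every comparison
        return False


# repeat() has no __contains__, so `in` falls back to iterating and
# comparing each emitted value against the needle.
found = Noisy() in itertools.repeat(1, 3)
```

Each of the three emitted values triggers one call to Noisy.__eq__, so a shortcut that skipped the comparisons would change observable behaviour for such types.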




Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Nick Coghlan
On 17 October 2017 at 19:19, Serhiy Storchaka  wrote:

> 17.10.17 09:42, Nick Coghlan wrote:
>
>> On 17 October 2017 at 16:32, Nick Coghlan <ncogh...@gmail.com> wrote:
>>
>> So this sounds like a reasonable API UX improvement to me, but you'd
>> need to ensure that you don't inadvertently change the external
>> behaviour of *successful* containment tests.
>>
>>
>> I should also note that there's another option here beyond just returning
>> "False": it would also be reasonable to raise an exception like
>> "RuntimeError('Attempted negative containment check on infinite iterator')".
>>
>
> What about other operations with infinite iterators? min(count()),
> max(count()), all(count(1))? Do you want to implement special cases for all
> of them?


No, as folks expect those to iterate without the opportunity to break out,
and are hence more careful with them when infinite iterators are part of
their application. We also don't have any existing protocols we could use
to intercept them, even if we decided we *did* want to do so.

The distinction I see with "x in y" is:

1. It's pretty easy to write "for x in y in y" when you really meant to
write "for x in y", and if "y" is an infinite iterator, the "y in y" part
will become an unbreakable infinite loop when executed instead of the
breakable one you intended (especially annoying if it means you have to
discard and restart a REPL session due to it, and that's exactly where that
kind of typo is going to be easiest to make)
2. Containment testing already has a dedicated protocol so containers can
implement optimised containment tests, which means it's also trivial for an
infinite iterator to intercept and explicitly disallow containment checks
if it chooses to do so

So the problem is more likely to be encountered due to "x in y" appearing
in both the containment test syntax and as part of the iteration syntax,
*and* it's straightforward to do something about it because the
__contains__ hook already exists. Those two points together are enough for
me to say "Sure, it makes sense to replace the current behaviour with
something more user friendly".

If either of them was false, then I'd say "No, that's not worth the hassle
of changing anything".
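As a rough sketch of point 2 above: a pure-Python stand-in can use the existing __contains__ hook to refuse containment checks outright. `GuardedCount` is a hypothetical wrapper written for illustration. It is not how itertools.count() is implemented, and the exception type and message are assumptions modelled on the suggestion earlier in the thread.

```python
import itertools


class GuardedCount:
    """Hypothetical wrapper around itertools.count() that rejects `in`."""

    def __init__(self, start=0, step=1):
        self._it = itertools.count(start, step)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __contains__(self, item):
        # Fail immediately instead of looping forever.
        raise RuntimeError("Attempted containment check on infinite iterator")
```

With this in place, the "for x in y in y" typo raises RuntimeError right away when y is a GuardedCount, while ordinary iteration is unaffected.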

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Nick Coghlan
On 17 October 2017 at 17:44, Steven D'Aprano  wrote:

> On Tue, Oct 17, 2017 at 04:42:35PM +1000, Nick Coghlan wrote:
>
> > I should also note that there's another option here beyond just returning
> > "False": it would also be reasonable to raise an exception like
> > "RuntimeError('Attempted negative containment check on infinite iterator')".
>
> I don't think that works, even if we limit discussion to just
> itertools.count() rather than arbitrary iterators. Obviously we
> cannot wait until the entire infinite iterator is checked (that
> might take longer than is acceptable...) but if you only check a
> *finite* number before giving up, you get false negatives:
>
> # say we only check 100 values before raising
> 0 in itertools.count(1)  # correctly raises
> 101 in itertools.count(1)  # wrongly raises
>

Nobody suggested that, as it's obviously wrong. This discussion is solely
about infinite iterators that have closed form containment tests, either
because they're computed (itertools.count()), or because they're based on
an underlying finite sequence of values (cycle(), repeat()).
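A sketch of what "closed form" could mean here, under loud assumptions: the three helper functions below are hypothetical, take the constructor arguments rather than the iterator itself, ignore values the iterator has already consumed, and the count() case handles exact ints only.

```python
def count_contains(item, start=0, step=1):
    """Would `item` ever be produced by count(start, step)? Exact ints only."""
    if not all(type(v) is int for v in (item, start, step)):
        raise TypeError("closed-form check only sketched for exact ints")
    if step == 0:
        return item == start
    offset = item - start
    # item must sit a whole, non-negative number of steps after start
    return offset % step == 0 and offset // step >= 0


def cycle_contains(item, values):
    """cycle(values) only ever emits the finite underlying values."""
    return item in list(values)


def repeat_contains(item, obj):
    """repeat(obj) only ever emits obj."""
    return item is obj or item == obj
```

Under these assumptions, count_contains(101, 1) is True and count_contains(0, 1) is False, and both answers are computed without iterating.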


> If we do a computed membership test, then why raise at all? We quickly
> know whether or not the value is in the sequence, so there's no error to
> report.
>

Because we should probably always be raising for these particular
containment checks, and it's safe to start doing so in the negative case,
since that's currently a guaranteed infinite loop.

And unlike a "while True" loop (which has many real world applications),
none of these implicit infinite loops allow for any kind of useful work on
each iteration, they just end up in a tight loop deep inside the
interpreter internals, doing absolutely nothing. They won't even check for
signals or release the GIL, so you'll need to ask the operating system to
clobber the entire process to break out of it - Ctrl-C will be ignored.

I'd also have no major objection to deprecating containment tests on these
iterators entirely, but that doesn't offer the same kind of UX benefit that
replacing an infinite loop with an immediate exception does, so I think the
two questions should be considered separately.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Serhiy Storchaka

17.10.17 09:42, Nick Coghlan wrote:
> On 17 October 2017 at 16:32, Nick Coghlan wrote:
>
>> So this sounds like a reasonable API UX improvement to me, but you'd
>> need to ensure that you don't inadvertently change the external
>> behaviour of *successful* containment tests.
>
> I should also note that there's another option here beyond just
> returning "False": it would also be reasonable to raise an exception
> like "RuntimeError('Attempted negative containment check on infinite
> iterator')".


What about other operations with infinite iterators? min(count()), 
max(count()), all(count(1))? Do you want to implement special cases for 
all of them?




Re: [Python-ideas] Why not picoseconds?

2017-10-17 Thread Koos Zevenhoven
Replying to myself again here, as nobody else said anything:

On Mon, Oct 16, 2017 at 5:42 PM, Koos Zevenhoven  wrote:
>
>
> Indeed. And some more on where the precision loss comes from:
>
> When you measure time starting from one point, like 1970, the timer
> reaches large numbers today, like 10**9 seconds. Tiny fractions of a second
> are especially tiny when compared to a number like that.
>
> You then need log2(10**9) ~ 30 bits of precision just to get a one-second
> resolution in your timer. A double-precision (64bit) floating point number
> has 53 bits of precision in the mantissa, so you end up with 23 bits of
> precision left for fractions of a second, which means you get a resolution
> of 1 / 2**23 seconds, which is about 100 ns, which is well in line with the
> data that Victor provided (~100 ns + overhead = ~200 ns).
>
>
My calculation is indeed *approximately* correct, but the problem is that
I made a bunch of decimal rounding errors while doing it, which was not
really desirable here. The exact expression for the resolution of
time.time() today is:

>>> import math, time
>>> 1 / 2**(53 - math.ceil(math.log2(time.time())))
2.384185791015625e-07

So this is in fact a little over 238 ns. Victor got 239 ns experimentally.
So actually the resolution is coarse enough to completely drown the
effects of overhead in Victor's tests, and now that the theory is done
correctly, it is completely in line with practice.
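The expression above can be wrapped in a small function and cross-checked against math.ulp() (available from Python 3.9). This is only a sketch: `float_resolution` is a hypothetical helper name, and the fixed timestamp is an assumed stand-in for time.time() on the day of this thread.

```python
import math


def float_resolution(t):
    """The expression from the message above, for a timestamp t in seconds."""
    return 1 / 2 ** (53 - math.ceil(math.log2(t)))


# Assumed example value: 2017-10-17 00:00:00 UTC as a Unix timestamp.
timestamp = 1508198400.0

resolution = float_resolution(timestamp)
spacing = math.ulp(timestamp)  # gap to the next representable double
resolution_ns = resolution * 1e9
```

For this timestamp both values are 2**-22 seconds, a little over 238 ns. The two can differ when t is an exact power of two, so the agreement should not be read as a general identity.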

––Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-ideas] Membership of infinite iterators

2017-10-17 Thread Steven D'Aprano
On Tue, Oct 17, 2017 at 04:42:35PM +1000, Nick Coghlan wrote:

> I should also note that there's another option here beyond just returning
> "False": it would also be reasonable to raise an exception like
> "RuntimeError('Attempted negative containment check on infinite iterator')".

I don't think that works, even if we limit discussion to just
itertools.count() rather than arbitrary iterators. Obviously we
cannot wait until the entire infinite iterator is checked (that
might take longer than is acceptable...) but if you only check a
*finite* number before giving up, you get false negatives:

# say we only check 100 values before raising
0 in itertools.count(1)  # correctly raises
101 in itertools.count(1)  # wrongly raises

If we do a computed membership test, then why raise at all? We quickly 
know whether or not the value is in the sequence, so there's no error to 
report.

Personally, I think a better approach is to give the specialist 
itertools iterator types a __contains__ method which unconditionally 
raises a warning regardless of whether the containment test returns 
true, false or doesn't return at all. Perhaps with a flag (module-wide?) 
to disable the warning, or turn it into an error.

I think a warning (by default) is better than an error because we don't 
really know for sure that it is an error:

n in itertools.count()

is, on the face of it, no more of an error than any other
potentially infinite loop:

while condition(n):
    ...

and like any so-called infinite loop, we can never be sure when to give 
up and raise. A thousand loops? A million? A millisecond? An hour? 
Whatever we pick, it will be a case of one-size fits none.

I appreciate that, in practice it is easier to mess up a containment 
test using one of the itertools iterators than to accidentally write an 
infinite loop using while, and my concession to that is to raise a 
warning, and let the programmer decide whether to ignore it or turn it 
into an error.
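A rough sketch of the warning-based approach described above. `WarningCount` is a hypothetical pure-Python wrapper, not the itertools implementation, and the choice of RuntimeWarning is an assumption. The standard warnings filters play the role of the flag that disables the warning or turns it into an error.

```python
import itertools
import warnings


class WarningCount:
    """Hypothetical count() wrapper that warns on every containment test."""

    def __init__(self, start=0, step=1):
        self._it = itertools.count(start, step)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __contains__(self, item):
        # Warn unconditionally, whatever the outcome turns out to be.
        warnings.warn(
            "containment test on a potentially infinite iterator",
            RuntimeWarning,
            stacklevel=2,
        )
        # Plain search; this never returns if item is never produced.
        return any(item == value for value in self._it)
```

A programmer who wants an error instead can call warnings.simplefilter("error", RuntimeWarning), and one who wants silence can use "ignore".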


-- 
Steve