-cc: python-steering-council

On Fri, Mar 5, 2021 at 4:26 PM Guido van Rossum <gu...@python.org> wrote:

> On Fri, Mar 5, 2021 at 11:11 AM Brett Cannon <br...@python.org> wrote:
>
>> Speaking for myself ...
>>
>
> Ditto ...
>
> On Fri, Mar 5, 2021 at 7:04 AM Mark Shannon <m...@hotpy.org> wrote:
>> [...]
>>
>>> In some cases, the PEP would have improved the situation.
>>>
>>> For example:
>>> sys.setrecursionlimit(5000)
>>> def f():
>>>      f()
>>>
>>> Currently, it raises a RecursionError on linux, but crashes the
>>> interpreter on Windows.
>>> With PEP 651 it would have raised a RecursionError on both platforms.
>>>
>>> Am I missing something here?
>>>
>>
>> So your example shows a user already comfortable in raising their
>> recursion limit to work around needing more stack space to reach
>> completion. What is stopping the user from continuing to raise the limit
>> until they still reach their memory limit even with PEP 651? If you're
>> worried about runaway recursion you will very likely hit that with the
>> default stack depth already, so I personally don't see how a decoupled
>> stack counter from the C stack specifically makes it any easier/better to
>> detect runaway recursion. And if I need more recursion than the default,
>> you're going to bump the recursion depth anyway, which weakens the
>> protection in either the C or decoupled counter scenarios. Sure, it's
>> currently platform-specific, but plenty of people want to push that limit
>> based on their machine anyway and don't need consistency on platforms they
>> will never run on, i.e. I don't see a huge benefit to being able to say
>> that an algorithm consistently won't go past 5000 calls on all platforms
>> compared to what the C stack protection already gives us (not to say
>> there's zero benefit, but it isn't massive or widespread either IMO). I
>> personally just don't see many people saying, "I really want to limit my
>> program to an exact call stack depth of 5000 on all platforms which is
>> beyond the default, but anything under is fine and anything over --
>> regardless of what the system can actually handle -- is unacceptable".
>>
>> Tack on the amount of changes required to give a cross-platform stack
>> count and limit check compared to the benefit being proposed, and to me
>> that pushes what the PEP is proposing into net-negative payoff.
>>
>
> To me, the point of that example is as a reminder that currently fiddling
> with the recursion limit can cause segfaults.
>
> Mark's PEP proposes two, somewhat independent, changes: (1) don't consume
> C stack on pure Python-to-Python (pp2p) function calls; (2) implement
> fool-proof C stack overflow checks.
>
> Change (2) makes it safe for users to mess with the stack overflow limit
> however they see fit. Despite (1), the limit for pp2p calls remains at 1000
> so that users who unintentionally write some naively recursive code don't
> have to wait until they fill up all of memory before they get a traceback.
> (Of course they could also write a while-True loop that keeps adding an
> item to a list and they'd end up in the same situation. But in my
> experience that situation is less painful to deal with than accidental
> stack overflow, and I'd shudder at the thought of a traceback of a million
> lines.)
>
> Given that we have (1), why is (2) still needed? Because there are ways to
> recursively call Python code that aren't pp2p calls. By a pp2p (pure
> Python-to-Python) call, I mean any direct call, e.g. a method call or a
> function call. But what about other operations that can invoke Python code?
> E.g. if we have a dict d and a class C, we could create an instance of C
> and use it to index d, e.g. d[C()]. This operation is not a p2pp call --
> the BINARY_SUBSCR opcode calls the dict's `__getitem__` method, and that
> calls the key's `__hash__` method. Here's a silly demonstration:
> ```
> def f(c):
>     d = {}
>     return d[c]
>
> class C:
>     def __hash__(self):
>         return f(self)
>
> f(C())
> ```
> Note that the "traceback compression" implemented for simple recursive
> calls fails here -- I just ran this code and got 2000 lines of output.
>
> The way I imagine Mark wants to implement pp2p calls means that in this
> case each recursion step *does* add several other C stack frames, and this
> would be caught by the limit implemented in (2). I see no easy way around
> this -- after all the C code involved in the recursion could be a piece of
> 3rd party C code that itself is not at fault.
>
> So we could choose to implement only (2), robust C stack overflow checks.
> This would require a bunch of platform-specific code, and there might be
> platforms where we don't know how to implement this (I vaguely recall a
> UNIX version where main() received a growable stack but each thread only
> had a fixed 64kb stack), but those would be no worse off than today.
>
> Or we could choose to implement only (1), eliminating C stack usage for
> pp2p calls. But in that case we'd still need a recursion limit for non-pp2p
> calls. We could eliminate the recursion limit entirely for pp2p calls, but
> then we'd probably end up with user complaints -- I could imagine a
> framework that executes arbitrary user code and attempts recovery after
> receiving an exception (for example a REPL). Such a framework can never be
> 100% foolproof (e.g. `while True: l.append(1)`, or `os._exit(0)`), but
> accidental recursion would be a reasonable thing to attempt to curtail, so
> we'd still need to implement a limit on pp2p calls. Since the new pp2p
> implementation allows for a much larger recursion limit, Mark's design with
> two separate limits and exceptions makes sense -- "recursion" to limit pp2p
> call recursion, and "stack overflow" to limit the C stack overflow. To make
> things more reliable, it makes sense to throw in (2), better checks for C
> stack overflow, but that could be done independently -- we could just
> implement the two separate limits and exceptions and C-stack-frame-free
> pp2p calls without (2), and we'd be no worse off than today with regard to
> the safety of "wild" recursion like my example above.
>

*(Resurrecting a 9 month old thread a bit as C stack overflows came up in
the context of https://github.com/python/cpython/pull/30855
<https://github.com/python/cpython/pull/30855>)*

Regarding (2)... robust C stack overflow checks are hard - if even
possible. At most you can get an improvement but never be able to guarantee
it. What I've seen proposed for that in this thread I believe won't deliver
in some real scenarios. Any code can spawn threads from extension modules
or embedding code with whatever C stack size it wants. Those threads can
(read: do) call back into the CPython interpreter. Any assumption of a
constant minimum C stack size is incorrect unless you go with the absolute
platform minimum possible value (too small). 512k is considered huge.

One way other language runtimes deal with stack overflows is via a
sigaltstack'd SIGSEGV handler that introspects what caused the signal (!)
to see if it was maybe a stack overflow and if so fixes things up (hello
Golang dynamic stack implementation) to recover before resuming execution.
This is difficult and potentially not even possible to guarantee correct
when you don't control the entire runtime as Golang does.

There has been some interesting talk about techniques recently in The
Orange Site's https://news.ycombinator.com/item?id=28682945 thread which
has cropped up since this discussion. The idea around allocating our own C
stack and swapping over to that is "neat" - but it'd be a challenge as
we're a re-entrant runtime so we'd effectively need to grow a list of newly
allocated C stacks for use upon every entry where we find one of our
thread's stacks is not the one in use. I suspect such an implementation
would probably be more complex than the feature is worth - but I'll leave
judgment on that up to after anyone actually manages to implement a cross
platform version of such an idea.

-gps
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W6PODCUS6ZCA5PH3XULTVJNKNXB3DNS6/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to