[Python-Dev] Re: bug(?) - unexpected frames being skipped in extract_stack with closures
Steven,

Yes, and I posted to python-dev for a reason: I'm almost positive that this is a bug, or at least an inconsistency, in how Python handles stack frames with respect to closures in some instances. In fact, the reason I posted is that we hit this inconsistency in production code. We need a reliable stack trace in our logs for every function we call, so that we can track down issues when they occur and tie them to the underlying code.

If I add a pdb.set_trace() inside the closure, I get a different (in fact the correct) stack trace. Otherwise, as I said, the stack trace points back to the place where the closure was defined, not the place where the closure was actually called.

Unfortunately, it looks like this bug does not show up in a simple example that I can readily reproduce (I just tried). If I see it again, I'll try to simplify it to the point where it still manifests and post that.

Ed

On Fri, Jun 21, 2019 at 1:35 AM Steve Holden wrote:
> Hi Ed,
>
> Your note probably won't receive any other reply than this, because the
> python-dev list is specifically for discussions about the development _of_,
> rather than _with_, Python.
>
> A more appropriate forum is probably the Python list
> (python-l...@python.org), about which you can discover more details at
> Python-list Info Page.
>
> Kind regards,
> Steve Holden
>
> On Thu, Jun 20, 2019 at 3:40 AM Ed Peschko wrote:
>> [...]
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DQBKRUI5ZMU6F3JHIVIZKT32DBOOQOLQ/
[Python-Dev] bug(?) - unexpected frames being skipped in extract_stack with closures
all,

I'm writing a function meant to print out the context of a given function call when executed. For example:

    1. def main():
    2.
    3.     _st = stack_trace_closure("/path/to/log")
    4.     _st()
    5.     _st()

would print out

    /path/to/file.py:4
    /path/to/file.py:5

for each line when executed. The basic idea is to create a closure and associate that closure with a filename, then run that closure to print to the log without needing to give the filename over and over again.

So far so good. But when I write this function, the frames given by getframeinfo or extract_stack skip the actual calling point of the function, instead giving back the *point where the closure was defined*. (In the above example, it would print /path/to/file.py:3, /path/to/file.py:3 instead of incrementing to show 4 and 5.)

However, when I insert a pdb statement, it gives me the expected calling frame where _st is actually called.

What's going on here? It looks an awful lot like a bug to me, as if an extra frame is being optimized out of the closure's stack prematurely.

I've tried this in python2.7 and python3.3; both show this.

thanks much for any info,

Ed

code follows:
---

    import traceback

    def stack_trace_closure(message, file_name=None, frame=3):
        # Open the log once; the closure keeps the handle.
        fh = open(file_name, "w+")

        def _helper():
            return stack_trace(message, frame, fh)

        return _helper

    def stack_trace(_message, _frame, fh):
        # One (file, line, function, text) entry per frame on the stack.
        _bt = traceback.extract_stack()
        fh.write("%s:%s - %s" % (_bt[_frame][0], _bt[_frame][1], _message))

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4MKHPCRNAJACKIBMLILMQMUPTEVFD3HW/
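One detail that may be worth checking in code like the above: traceback.extract_stack() lists frames from the oldest call to the newest, so a fixed positive index such as 3 counts from the outermost frame, and the frame it selects depends on how deeply nested the program is at that point. Whether this accounts for the behaviour reported in the thread is not established there. The sketch below is only an illustration of counting from the newest frame instead; the `depth` parameter and the `out` argument (any object with a write() method) are assumptions made for the example, not part of the original code.

```python
import traceback


def stack_trace(message, depth, out):
    """Write file:line for the frame `depth` calls above this function."""
    # Frames are ordered oldest first, so bt[-1] is this function itself.
    # Counting back from the end does not depend on total stack depth.
    bt = traceback.extract_stack()
    entry = bt[-1 - depth]
    out.write("%s:%s - %s\n" % (entry[0], entry[1], message))
    return entry[1]


def stack_trace_closure(message, out, depth=2):
    """Return a callable that logs the location it is called from."""
    def _helper():
        # depth=2 skips stack_trace (0) and _helper (1), landing on the caller.
        return stack_trace(message, depth, out)

    return _helper
```

With this layout, two consecutive calls to the returned closure report two consecutive line numbers, regardless of where the closure was created.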
Re: [Python-Dev] short-circuiting runtime errors/exceptions in the python debugger.
Steve,

thanks for the response. Yes, I've experimented with reverse debugging, and yes, for the reasons specified in the article you gave, it isn't really practical for anything but small projects because of the massive memory usage.

But that's really not what I'm asking for here. I'm simply asking that Python give developers the opportunity to stop execution and open a debugger *before* the exception is hit, and to instrument the code at that point using that debugger.

The way that I mimic this in Perl is by wrapping the places where die, confess, croak, AUTOLOAD, etc. are hit with something that looks like this:

    sub _confess {
        $DB::single = 1;
        my $trap = 1;
        if ($trap) {
            confess(@_);
        }
    }

That way, I can short-circuit evals before they happen and save the state to both examine and modify. It's a hack, but a manageable one, since there are only so many places where things can go off the rails in Perl, and I cover each exit. I can set $trap = 0 and, lo and behold, execution continues as if no trap had been issued.

Now, if global reverse debugging has been implemented for Python, this has got to be doable as well. I'd dare say it could be done with very little memory overhead.

> You know you can set breakpoints in the debugger? You don't have to
> single-step all the way through from the beginning. It isn't clear from
> your post how experienced you are.

OK, just to be totally clear: yes, I'm quite aware of pdb.set_trace(), etc. However, they don't do much to ease the pain of debugging and maintaining code which is *not your own*. For all the good they do, you might as well step through each line, because you have no clue where the mines in a particular script or library reside, or when you are going to hit them. And of course, depending on how good the debugging implementation is, when you hit these mines they may be well-nigh impossible to remove without something like this.

That, by the way, is the main reason WHY I'm asking for this, and why I think it would be such a productivity booster: not for my own code, which I control, but for other people's.

I'll take one particular experience to make my point. I've been experimenting with an automation framework written in Python that shall otherwise remain nameless (except to say that it is a fairly popular one). Its ssh layer is *notoriously* obtuse: it freezes up on multiple occasions, the support lists are clogged with issues surrounding it, yet release after release things don't get fixed. Why? I'd argue it's because the code is hard to parse but even harder to debug. Having something like this would be a great boon to fixing any such issues.

So that's basically it. No, I'm not looking for a full-blown reverse debugger, for the reasons you state. I'm looking for something more limited and therefore more workable.

Ed
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
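For comparison, a rough Python analogue of the Perl wrapper described in the message above might look like the following. This is a sketch under stated assumptions, not an existing pdb feature: `trapped_raise` and the `TRAP` switch are hypothetical names, and like the Perl version it only covers raise sites that you wrap yourself. It does nothing for exceptions the interpreter raises on its own, such as a TypeError from `+`.

```python
import pdb

# Module-level switch that can be changed from the debugger prompt:
#   (Pdb) TRAP["raise"] = False
TRAP = {"raise": True}


def trapped_raise(exc, debugger=pdb.set_trace):
    """Pause in a debugger just before raising `exc`.

    Hypothetical helper. If TRAP["raise"] is cleared while paused, the
    exception is skipped and the caller carries on.
    """
    debugger()                # plays the role of $DB::single = 1
    if TRAP["raise"]:
        raise exc
    TRAP["raise"] = True      # re-arm the trap for the next call
    return None
```

The switch lives in a module-level dict rather than a local variable because changes made to function locals from the pdb prompt are not reliably written back on all Python versions, whereas mutating a global object is.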
[Python-Dev] short-circuiting runtime errors/exceptions in python debugger.
all,

I was debugging a very long script that I was not all that familiar with, and I was following my usual routine of being very careful in evaluating expressions, to make sure that I didn't hit errors such as:

    TypeError: unsupported operand type(s) for +: 'int' and 'str'

Anyway, the script has a runtime of hours, so this was tedious work, and one too many times I missed a condition and had to start all over again.

So that got me thinking: the main problem with exceptions and runtime errors is that they short-circuit the program context. You have this context, but you can't change it to avoid the failure. For example, with

    1. aa = 'a'
    2. bb = 1 + aa
    3. print bb

you'll never get to line 3 if you go through line 2. If you are lucky, you can catch it and modify aa to be an integer to continue; otherwise your only recourse is to start over again.

So I was wondering whether it would be possible to keep that context around when you are in the debugger, and rewind the execution point to just before the failing statement. You could then say:

    python script.py
    (Pdb) c

then hit the exception, say:

    (Pdb) aa = 2

then run again:

    (Pdb) c

to work your way through the exception point. You could then fix the script inline in an editor.

I can't emphasize enough how much time and effort this would save. At best, debugging these types of issues is annoying; at worst it is excruciating, because they are sometimes intermittent and not easily repeatable. Using RemotePdb or Pdb to attach to a long-running process, with the assurance that the underlying script won't die because of an inane coding error, would do wonders for the reliability and integrity of scripting.

So, is this possible? From my experiments it doesn't look like it, but perhaps there is a trick out there that would give this functionality. If it isn't possible, how easy would it be to implement?

Thanks much,

Ed
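The "fix the context and retry the failing statement" idea can be approximated at a very coarse level today, though only for top-level statements that are supplied as separate source strings. The following is a minimal sketch of that idea, not something pdb provides: `run_with_retry`, `on_error`, and `max_retries` are hypothetical names. By default the handler opens the post-mortem debugger on the failing traceback; any handler that repairs the namespace will do.

```python
import pdb


def run_with_retry(statements, namespace, on_error=None, max_retries=3):
    """Execute source statements one by one, retrying a failed statement.

    `on_error(exc, namespace)` gets a chance to repair `namespace`
    before the same statement is executed again.
    """
    if on_error is None:
        def on_error(exc, ns):
            # Inspect (and possibly repair) the state interactively.
            pdb.post_mortem(exc.__traceback__)

    for source in statements:
        for attempt in range(max_retries + 1):
            try:
                exec(source, namespace)
                break                     # statement succeeded, move on
            except Exception as exc:
                if attempt == max_retries:
                    raise                 # give up after the last retry
                on_error(exc, namespace)
    return namespace
```

Applied to the three-line example from the message, a handler that sets `aa = 2` after the TypeError lets the second statement succeed on retry. Statements inside functions or loops cannot be resumed this way, which is the harder part of what the message asks for.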
Re: [Python-Dev] \G (match last position) regex operator nonexistent in python?
> From this I understand that when using e.g. findall() it forces successive
> matches to be adjacent.

Yes, I admit that this is a clearer description of what \G does. My only defense is that I wrote my description when it was late. :)

I can only stress how useful it is, especially for debugging regexes. Basically, if you are cutting up a string into discrete chunks, you want to make sure that you aren't missing any chunks in the middle when you do the cut. Without \G, you can miss large sections of the string, and it is easy to overlook. With \G, you are guaranteed to see exactly where your regex falls down. In addition, there are specific regexes that you can only write with \G (e.g. C parsers).

Anyway, I'll look at regex.

On Fri, Oct 27, 2017 at 8:35 AM, Guido van Rossum wrote:
> The "why" question is not very interesting -- it probably wasn't in PCRE and
> nobody was familiar with it when we moved off PCRE (maybe it wasn't even in
> Perl at the time -- it was ~15 years ago).
>
> I didn't understand your description of \G so I googled it and found a
> helpful StackOverflow article:
> https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex.
> From this I understand that when using e.g. findall() it forces successive
> matches to be adjacent.
>
> In general this seems to be a unique property of \G: it preserves *state*
> from one match to the next. This will make it somewhat difficult to
> implement -- e.g. that state should probably be thread-local in case
> multiple threads use the same compiled regex. It's also unclear when that
> state should be reset. (Only when you compile the regex? Each time you pass
> it a different source string?)
>
> So I'm not sure it's reasonable to add. But I also don't see a reason why it
> shouldn't be added -- presuming we can decide on good answer for the
> questions above about the "scope" of the anchor.
>
> I think it's okay to start a discussion on bugs.python.org about the precise
> specification of \G for Python. OTOH I expect that most core devs won't find
> this a very interesting problem (Python relies on regexes for parsing a lot
> less than Perl does).
>
> Good luck!
>
> On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote:
>> [...]
>
> --
> --Guido van Rossum (python.org/~guido)
[Python-Dev] \G (match last position) regex operator nonexistent in python?
All,

Perl has a regex assertion (\G) that allows multiple-match regular expressions to use the position of the last match. Perl's documentation puts it this way:

    \G  Match only at pos() (e.g. at the end-of-match position of prior m//g)

Anyway, this is exceedingly powerful for matching regularly structured free-form records, and I was really surprised when I found out that Python did not have it. For example, if findall supported this, it would be possible to write things like this (a quick and dirty ifconfig parser):

    pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S)

    val = """
    eth2      Link encap:Ethernet  HWaddr xx
              inet addr: xx.xx.xx.xx  Bcast:xx.xx.xx.xx  Mask:xx.xx.xx.xx
    ...
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
    """
    matches = re.findall(pat, val)

So, why doesn't Python have this? Is it something that was simply overlooked, or is there another method of doing the same thing with arbitrarily complex free-form records?

thanks much..
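The adjacency that \G enforces can be approximated with the standard re module by calling a compiled pattern's match() method with an explicit start position, since match() only succeeds if the pattern matches exactly at that position. The sketch below is one way to do that, offered as an illustration rather than a drop-in replacement for \G; `match_adjacent` is a hypothetical helper name, and the sample text is a shortened stand-in for the ifconfig output in the message.

```python
import re


def match_adjacent(pattern, text, pos=0):
    """Collect matches that each start exactly where the previous one ended.

    Returns (matches, stop). `stop` is the offset where matching gave up,
    and equals len(text) only if the whole string was consumed.
    """
    matches = []
    while pos < len(text):
        m = pattern.match(text, pos)   # anchored at pos, unlike search()
        if m is None or m.end() == pos:
            break                      # no match here, or an empty match
        matches.append(m)
        pos = m.end()
    return matches, pos


# Record pattern from the message, without the \G anchor.
pat = re.compile(r'(\S+)(.*?\n)(?=\S+|\Z)', re.S)
val = (
    "eth2 Link encap:Ethernet HWaddr xx\n"
    "     inet addr:xx.xx.xx.xx\n"
    "lo   Link encap:Local Loopback\n"
    "     inet addr:127.0.0.1 Mask:255.0.0.0\n"
)
records, stop = match_adjacent(pat, val)
names = [m.group(1) for m in records]
```

Returning the stopping offset alongside the matches is a design choice made for this example: comparing it with len(text) shows whether, and where, the pattern failed to account for part of the input, which findall() on its own would silently skip.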