Re: What to use for finding as many syntax errors as possible.
On Sunday, October 9, 2022 at 12:09:45 PM UTC+2, Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible
> in a python file. I know there is the risk of false positives when a
> tool tries to recover from a syntax error and proceeds but I would
> prefer that over the current python strategy of quitting after the first
> syntax error. I just want a tool for syntax errors. No style
> enforcements. Any recommendations?
>
> --
> Antoon Pardon

Bit late here, coming from the PyCoder's Weekly email newsletter, but I'm surprised that I don't see any mentions of [parso](https://parso.readthedocs.io/en/latest/):

> Parso is a Python parser that supports error recovery and round-trip parsing
> for different Python versions (in multiple Python versions). Parso is also
> able to list multiple syntax errors in your python file.

--
https://mail.python.org/mailman/listinfo/python-list
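For readers who want to try the suggestion, here is a minimal sketch of listing several syntax errors with parso. It assumes the third-party `parso` package is installed and that it ships a grammar for the Python version you are running; the helper name `list_syntax_errors` and the sample input are made up for illustration and do not come from the thread.

```python
try:
    import parso  # third-party package, e.g. `pip install parso`
except ImportError:  # keep the sketch importable when parso is absent
    parso = None


def list_syntax_errors(source):
    """Return (line, column, message) for every issue parso reports.

    Hypothetical helper: only load_grammar(), Grammar.parse() and
    Grammar.iter_errors() are parso calls.
    """
    if parso is None:
        raise RuntimeError("parso is not installed")
    grammar = parso.load_grammar()   # grammar for the running Python version
    module = grammar.parse(source)   # error-recovering parse; does not raise
    return [
        (issue.start_pos[0], issue.start_pos[1], issue.message)
        for issue in grammar.iter_errors(module)
    ]


if __name__ == "__main__" and parso is not None:
    sample = "foo +\nbar\ncontinue"  # made-up input containing two problems
    for line, column, message in list_syntax_errors(sample):
        print(f"{line}:{column}: {message}")
```

Because parso recovers and keeps going, later reports may be follow-on effects of an earlier error. That is the false-positive risk the original question already accepts.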
Re: What to use for finding as many syntax errors as possible.
On 2022-10-11 14:11:56 -0400, Thomas Passin wrote:
> To bring things back to the context of the original post, actual web
> browsers are extremely tolerant of HTML syntax errors (including incorrect
> nesting of tags) in the documents they receive.

HTML5 actually specifies exactly how to recover from errors. So since every sequence of bytes results in a well-defined DOM tree you might argue (a bit tongue in cheek) that there are no syntax errors in HTML5.

hp

--
Peter J. Holzer | h...@hjp.at | http://www.hjp.at/
"Story must make more sense than reality." -- Charles Stross, "Creative writing challenge!"
Re: What to use for finding as many syntax errors as possible.
On 2022-10-13 11:23:40 +1100, Chris Angelico wrote:
> On Thu, 13 Oct 2022 at 11:19, Peter J. Holzer wrote:
> > On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> > > On Tue, 11 Oct 2022 at 09:18, Cameron Simpson wrote:
> > >
> > > Consider:
> > >
> > >     if condition # no colon
> > >         code
> > >     else:
> > >         code
> > >
> > > To actually "restart" parsing, you have to make a guess of some sort.
> >
> > Right. At least one of the papers on parsing I read over the last few
> > years (yeah, I really should try to find them again) argued that the
> > vast majority of syntax errors is either a missing token, a superfluous
> > token or a combination of the two. So one strategy with good results
> > is to heuristically try to insert or delete single tokens and check
> > which results in the longest distance to the next error.
> >
> > Checking multiple possible fixes has its cost, especially since you have
> > to do that at every error. So you can argue that it is better for
> > productivity if you discover one error in 0.1 seconds than 10 errors in
> > 5 seconds.
>
> Maybe; but what if you report 10 errors in 5 seconds, but 8 of them
> are spurious? You've reported two useful errors in a sea of noise.
> Even if it's the other way around (8 where you nailed it and correctly
> reported the error, 2 that are nonsense), is it actually helpful?

Humans are pattern-matching animals. It is quite possible that seeing a bunch of related errors makes the fix more obvious than seeing them in isolation.

No, I haven't done any studies on this. Yes, it is possible that all those compiler writers who spent lots of work on error recovery over the last 50 years (or longer) are delusional.

> > > > I grew up with C and Pascal compilers which would _happily_ produce many
> > > > complaints, usually accurate, and all manner of syntactic errors. They
> > > > didn't stop at the first syntax error.
> > >
> > > Yes, because they work with a much simpler grammar.
> >
> > I very much doubt that. Python doesn't have a particularly complicated
> > grammar, and C certainly doesn't have a particularly simple one.
> >
> > The argument that it's impossible in Python (unlike any other language),
> > because Python is oh so special doesn't hold water.
>
> Never said it's because Python is special; there are a LOT of
> languages that are at least as complicated.

And almost all of their compilers do try to recover from errors.

> But I do think that Pascal, especially, has a significantly simpler
> grammar than Python does.

Incidentally, Turbo Pascal was the one other example of a compiler which *didn't* try to recover.

hp
Re: What to use for finding as many syntax errors as possible.
On Thu, 13 Oct 2022 at 11:23, dn wrote:
> # add an extra character within identifier, as if 'new' identifier
> 28 assert expected_value == fyibonacci_number
>                             UUU
>
> # these all trivial SYNTAX errors - could have tried leaving-out a
> keyword, but ...

Just to be clear, this last one is not actually a *syntax* error - it's a misspelled name, but contextually, that is clearly a name and nothing else. These are much easier to report multiples of, and typical syntax highlighters will do so.

Your other two examples were both syntactic discrepancies though.

ChrisA
Re: What to use for finding as many syntax errors as possible.
On Thu, 13 Oct 2022 at 11:19, Peter J. Holzer wrote:
>
> On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> > On Tue, 11 Oct 2022 at 09:18, Cameron Simpson wrote:
> >
> > Consider:
> >
> >     if condition # no colon
> >         code
> >     else:
> >         code
> >
> > To actually "restart" parsing, you have to make a guess of some sort.
>
> Right. At least one of the papers on parsing I read over the last few
> years (yeah, I really should try to find them again) argued that the
> vast majority of syntax errors is either a missing token, a superfluous
> token or a combination of the two. So one strategy with good results
> is to heuristically try to insert or delete single tokens and check
> which results in the longest distance to the next error.
>
> Checking multiple possible fixes has its cost, especially since you have
> to do that at every error. So you can argue that it is better for
> productivity if you discover one error in 0.1 seconds than 10 errors in
> 5 seconds.

Maybe; but what if you report 10 errors in 5 seconds, but 8 of them are spurious? You've reported two useful errors in a sea of noise. Even if it's the other way around (8 where you nailed it and correctly reported the error, 2 that are nonsense), is it actually helpful?

Bear in mind that, if you can discover one syntax error in 0.1 seconds, you can do that check *the moment the user types a key* in the editor (which is more-or-less what happens with most syntax highlighting editors - some have a small delay to avoid being too noisy with error reporting, but same difference). Why report false errors when you can report errors one by one and know that they're true?

> > > I grew up with C and Pascal compilers which would _happily_ produce many
> > > complaints, usually accurate, and all manner of syntactic errors. They
> > > didn't stop at the first syntax error.
> >
> > Yes, because they work with a much simpler grammar.
>
> I very much doubt that. Python doesn't have a particularly complicated
> grammar, and C certainly doesn't have a particularly simple one.
>
> The argument that it's impossible in Python (unlike any other language),
> because Python is oh so special doesn't hold water.
>

Never said it's because Python is special; there are a LOT of languages that are at least as complicated. Try giving multiple useful errors when there's a syntactic problem in SQL, for instance. But I do think that Pascal, especially, has a significantly simpler grammar than Python does.

ChrisA
Re: What to use for finding as many syntax errors as possible.
On 09/10/2022 23.09, Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible
> in a python file. I know there is the risk of false positives when a
> tool tries to recover from a syntax error and proceeds but I would
> prefer that over the current python strategy of quitting after the first
> syntax error. I just want a tool for syntax errors. No style
> enforcements. Any recommendations?
>
> --
> Antoon Pardon

Am not sure if have really understood problem being addressed, because it seems 'answered' - perhaps the question says more about the tool-set being utilised...

As someone who used to manually check and re-check code before submitting (first punched-cards, and later edited files) source to a compiler, it took some re-education to learn what to expect from a modern/language-intelligent IDE.

The topic was a major interest back in the days of batch-compilers. Plus we had other tools, eg CREF/XREF utilities which produced cross-references of identifier usage - and illustrated typos in identifiers, usage before value-assignment, etc (per request from one respondent).

Using an IDE which is inspecting source-code as it is being typed (or when an existing file is opened) will suggest what might?should be typed 'next' (a mixed blessing IMHO!), and secondly highlights errors until they are noticed and dealt-with. Some, especially warnings, can be safely ignored - and yes, some are spurious and SHOULD be ignored!

PyCharm* displays a number of indicators. The least intrusive appears in the top-right corner of the editor-tab listing, eg 8 errors, 2 warnings. So, apparently not 'stopping' at first error found.

Within the source-code itself, there are high-lights and under-lines (in and amongst the syntax highlighting presentation/theme) - which I suppose are easier to notice during data-entry if one is a touch-typist. Accordingly, not much of a context for multiple errors to be committed during a single coding-session, but remaining un-noticed until 'the end'.

For illustration, I took a simple tutorial* routine and deliberately introduced some/many of the types of error discussed within this thread.

It would have been ideal to attach a graphic but here are some lines of code, under which I have attempted to represent a highlighted character (related to the line above) with an "H", and a (red) under-lined token with a "U". So, this is a feeble-attempt to show how the source is displayed and annotated by the IDE:

    # mis-type the tuple-assignment by adding semi-colon
    # which might also confuse Python into thinking of a second instruction
    17 i, j = 0;, 1
       H UH

    # replace under-line/under-score with space: s/b expected_value
    25 for expected value, fibonacci_number in \
       UU

    # mis-type the name of the zip built-in function
    26 z ip( SERIES, fibonacci_generator() ):
       U

    # add an extra character within identifier, as if 'new' identifier
    28 assert expected_value == fyibonacci_number
       UUU

    # these all trivial SYNTAX errors - could have tried leaving-out a keyword, but ...

(The column positions of the "H" and "U" markers were lost when this message was archived; they are shown here on the line below the code they annotate.)

Assuming the problem is not noticed/handled as the text is being typed, and in addition to the coder reviewing the work, recognising problems, and dealing with them him-/her-self; the IDE offers two follow-up mechanisms:

1 a means to jump 'focus' from the site of one error to the next, whereupon a pop-up will describe the error, eg (line 28) "Unresolved reference 'expected_value'"; which illustrates one problem in-isolation. In this case, line 28 is 'at fault' despite the fact that the 'error' is a consequence of THE problem on line 25!

2 a "Problems" Tool Window can be displayed, which will list every error and warning, with pretty, colored, icons, and the same message per example above, together with the relevant line-number, (the first two entries, as-listed, are 'warnings', and the rest are described as "errors"):

    Need more values to unpack:17
    Statement seems to have no effect:17
    # so it has picked-up both of my nefarious intentions
    Statement expected, found Py:COMMA:17
    # as above
    # NB the "Py:COMMA" is from tokenize (per @Chris contribution(s))
    'in' expected:25
    # logical, but confused by the space
    Unresolved reference 'value':25
    # pretty-much had no chance with so many faults in one statement!
    Unresolved reference 'fibonacci_number':25
    # ditto
    Unresolved reference 'z':26
    # absolutely!
    ':' expected:26
    # evidently re-started after the "in" and did what it could with the "z"
    Unresolved reference 'expected_value':28
    # it would be "resolved" but for the first error on line 25
    Unresolved reference 'fyibonacci_number':28
    # ahah! Apparently trying to use an identifier before declaring/defining
    # in reality, just another typo
    # that said, I created the issue by inserting the "y"
    # if I'd mistyped the ent
Re: What to use for finding as many syntax errors as possible.
On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> On Tue, 11 Oct 2022 at 09:18, Cameron Simpson wrote:
>
> Consider:
>
>     if condition # no colon
>         code
>     else:
>         code
>
> To actually "restart" parsing, you have to make a guess of some sort.

Right. At least one of the papers on parsing I read over the last few years (yeah, I really should try to find them again) argued that the vast majority of syntax errors is either a missing token, a superfluous token or a combination of the two. So one strategy with good results is to heuristically try to insert or delete single tokens and check which results in the longest distance to the next error.

Checking multiple possible fixes has its cost, especially since you have to do that at every error. So you can argue that it is better for productivity if you discover one error in 0.1 seconds than 10 errors in 5 seconds.

> > I grew up with C and Pascal compilers which would _happily_ produce many
> > complaints, usually accurate, and all manner of syntactic errors. They
> > didn't stop at the first syntax error.
>
> Yes, because they work with a much simpler grammar.

I very much doubt that. Python doesn't have a particularly complicated grammar, and C certainly doesn't have a particularly simple one.

The argument that it's impossible in Python (unlike any other language), because Python is oh so special doesn't hold water.

hp
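The insert-or-delete strategy described in this message can be sketched in a few lines of standard-library Python. This is only a toy illustration, not the algorithm from the papers mentioned (which are not identified in the thread). It assumes the error is on the line the parser reports, tries a small hard-coded set of insertable tokens at the end of that line only, and splits the line into "tokens" with a crude regular expression. The names `first_error_line`, `candidate_repairs` and `best_repair` are made up for this sketch.

```python
import ast
import re

TOKEN = re.compile(r"\w+|[^\w\s]")   # crude stand-in for a real tokenizer
INSERTABLE = (":", ")", "]", ",")    # assumed set of tokens worth trying


def first_error_line(source):
    """Return the line number of the first syntax error, or None."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return exc.lineno or 1
    return None


def candidate_repairs(line):
    """Yield (description, new_line) for single-token edits of one line."""
    for tok in INSERTABLE:
        yield f"insert {tok!r} at end", line.rstrip() + tok
    for match in TOKEN.finditer(line):
        yield (f"delete {match.group()!r}",
               line[:match.start()] + line[match.end():])


def best_repair(source):
    """Pick the single-token edit that pushes the next error furthest away."""
    lines = source.splitlines()
    err = first_error_line(source)
    if err is None:
        return None
    err = min(err, len(lines))       # end-of-input errors can point past the end
    best = None
    for desc, new_line in candidate_repairs(lines[err - 1]):
        trial = lines[:err - 1] + [new_line] + lines[err:]
        nxt = first_error_line("\n".join(trial) + "\n")
        distance = float("inf") if nxt is None else nxt - err
        if best is None or distance > best[0]:
            best = (distance, desc, new_line)
    return err, best[1], best[2]


if __name__ == "__main__":
    broken = "if x\n    y = 1\nelse:\n    y = 2\n"   # the missing-colon case
    print(best_repair(broken))
```

Every candidate costs one full re-parse, which is exactly the cost trade-off raised in the message: the more fixes you try per error, the slower each report becomes.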
Re: What to use for finding as many syntax errors as possible.
On 11Oct2022 17:45, Thomas Passin wrote:
> Personally, I'd most likely go for a decent programming editor that you
> can set up to run a program on your file, use that to run a checker,
> like pyflakes for instance, and run that from time to time. You could
> run it when you save a file. Even if it only showed one error at a
> time, it would make quick work of correcting mistakes. And it wouldn't
> need to trigger an entire tool chain each time.

Aye. I've got my editor (vim) configured to run an autoformatter on my code when I save (this can be turned off, and parse errors prevent any reformatting).

Linters I run by hand from the adjacent shell window, via a small script which runs my preferred linters with their preferred options.

My current workplace triggers the CI workflow when you push commits upstream, and you can make branch names which do not trigger the CI stuff. So there's a decent separation between saving (and testing or locally running the dev code) from the CI cycle.

Cheers,
Cameron Simpson
Re: What to use for finding as many syntax errors as possible.
On 10/11/2022 5:09 PM, Thomas Passin wrote:
> The OP wants to get help with problems in his files even if it isn't
> perfect, and I think that's reasonable to wish for. The link to a post
> about the lezer parser in a recent message on this thread is partly
> about how a real, practical parser can do some error correction in
> mid-flight, for the purposes of a programming editor (as opposed to one
> that has to build a correct program).

One editor that seems to do what the OP wants is Visual Studio Code. It will mark apparent errors - not just syntax errors - not limited to one per page. Sometimes it can even suggest corrections. I personally dislike the visual clutter the markings impose, but I imagine I could get used to it.

VSC uses a Microsoft system they call "PyLance" - see

https://devblogs.microsoft.com/python/announcing-pylance-fast-feature-rich-language-support-for-python-in-visual-studio-code/

Of course, you don't get something complex for free, and in this case the cost is having to run a separate server to do all this analysis on the fly. However, VSC handles all of that behind the scenes so you don't have to.

Personally, I'd most likely go for a decent programming editor that you can set up to run a program on your file, use that to run a checker, like pyflakes for instance, and run that from time to time. You could run it when you save a file. Even if it only showed one error at a time, it would make quick work of correcting mistakes. And it wouldn't need to trigger an entire tool chain each time.

My editor of choice for setting up helper "tools" like this on Windows is Editplus (non-free but cheap and very worth it), and I have both py_compile and pyflakes set up this way in it. However, as I mentioned in an earlier post, the Leo Editor (https://github.com/leo-editor/leo-editor) does this for you automatically when you save, so it's very convenient. That's what I mostly work in.
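As a rough illustration of the "run a checker when you save" setup, the sketch below wraps the standard-library `py_compile` module in a small function an editor hook could call. The function name `check_file` and the idea of writing the bytecode into a throwaway directory are assumptions of this sketch, not something described in the message. Like the interpreter itself, it reports only the first syntax error.

```python
import os
import py_compile
import tempfile


def check_file(path):
    """Return None if `path` compiles, else the compiler's error message.

    Hypothetical helper for an editor's on-save hook.
    """
    with tempfile.TemporaryDirectory() as scratch:
        try:
            # Write the bytecode to a scratch file so that checking a file
            # does not leave a __pycache__ entry next to it.
            py_compile.compile(
                path,
                cfile=os.path.join(scratch, "check.pyc"),
                doraise=True,
            )
        except py_compile.PyCompileError as exc:
            return exc.msg
    return None


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        message = check_file(name)
        print(f"{name}: {'OK' if message is None else message}")
```

From a shell, `python -m py_compile file.py` performs the same check, although it does write the compiled file to `__pycache__`.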
Re: What to use for finding as many syntax errors as possible.
On 10/11/2022 4:00 PM, Chris Angelico wrote:
> On Wed, 12 Oct 2022 at 05:23, Thomas Passin wrote:
> >
> > On 10/11/2022 3:10 AM, avi.e.gr...@gmail.com wrote:
> > > I see resemblances to something like how a web page is loaded and
> > > operated. I mean very different but at some level not so much.
> > >
> > > I mean a typical web page is read in as HTML with various keyword
> > > regions expected such as ... or ... with things often cleanly nested
> > > in others. The browser makes nodes galore in some kind of tree format
> > > with an assortment of objects whose attributes or methods represent
> > > aspects of what it sees. The resulting treelike structure has names
> > > like DOM.
> >
> > To bring things back to the context of the original post, actual web
> > browsers are extremely tolerant of HTML syntax errors (including
> > incorrect nesting of tags) in the documents they receive. They usually
> > recover silently from errors and are able to display the rest of the
> > page. Usually they manage this correctly.
>
> Having had to debug tiny errors in HTML pages that resulted in
> extremely weird behaviour, I'm not sure that I agree that they usually
> manage correctly. Fundamentally, they guess, and guesswork is never
> reliable.

Still, browsers generally do a very decent job of recovery, even though perfection isn't possible.

The OP wants to get help with problems in his files even if it isn't perfect, and I think that's reasonable to wish for. The link to a post about the lezer parser in a recent message on this thread is partly about how a real, practical parser can do some error correction in mid-flight, for the purposes of a programming editor (as opposed to one that has to build a correct program).
Re: What to use for finding as many syntax errors as possible.
On Wed, 12 Oct 2022 at 05:23, Thomas Passin wrote:
>
> On 10/11/2022 3:10 AM, avi.e.gr...@gmail.com wrote:
> > I see resemblances to something like how a web page is loaded and operated.
> > I mean very different but at some level not so much.
> >
> > I mean a typical web page is read in as HTML with various keyword regions
> > expected such as ... or ... with things
> > often cleanly nested in others. The browser makes nodes galore in some kind
> > of tree format with an assortment of objects whose attributes or methods
> > represent aspects of what it sees. The resulting treelike structure has
> > names like DOM.
>
> To bring things back to the context of the original post, actual web
> browsers are extremely tolerant of HTML syntax errors (including
> incorrect nesting of tags) in the documents they receive. They usually
> recover silently from errors and are able to display the rest of the
> page. Usually they manage this correctly.

Having had to debug tiny errors in HTML pages that resulted in extremely weird behaviour, I'm not sure that I agree that they usually manage correctly. Fundamentally, they guess, and guesswork is never reliable.

ChrisA
Re: What to use for finding as many syntax errors as possible.
On 10/11/2022 3:10 AM, avi.e.gr...@gmail.com wrote:
> I see resemblances to something like how a web page is loaded and operated.
> I mean very different but at some level not so much.
>
> I mean a typical web page is read in as HTML with various keyword regions
> expected such as ... or ... with things
> often cleanly nested in others. The browser makes nodes galore in some kind
> of tree format with an assortment of objects whose attributes or methods
> represent aspects of what it sees. The resulting treelike structure has
> names like DOM.

To bring things back to the context of the original post, actual web browsers are extremely tolerant of HTML syntax errors (including incorrect nesting of tags) in the documents they receive. They usually recover silently from errors and are able to display the rest of the page. Usually they manage this correctly.

The OP would like to have a parser or checker that could do the same, plus giving an output showing where each of the errors happened. I can imagine such a parser also reporting which lines it had to skip before it was able to recover.
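A very crude version of such a "skip and keep going" checker can be built on the standard library alone. The sketch below is an illustration under stated assumptions, not an existing tool. It re-parses the file repeatedly, and each time the parser complains it records the reported line, replaces that line with `pass` at the same indentation, and tries again. The function name `find_syntax_errors` and the `max_errors` limit are invented for this example.

```python
import ast


def find_syntax_errors(source, max_errors=20):
    """Return [(lineno, message, skipped_text), ...] via skip-a-line recovery."""
    lines = source.splitlines()
    skipped = set()
    found = []
    while lines and len(found) < max_errors:
        try:
            ast.parse("\n".join(lines) + "\n")
        except SyntaxError as exc:
            # Errors at end of input can point past the last line; clamp.
            lineno = min(max(exc.lineno or 1, 1), len(lines))
            if lineno in skipped:
                break  # skipping this line did not help; stop, don't loop
            text = lines[lineno - 1]
            found.append((lineno, exc.msg, text))
            skipped.add(lineno)
            # "Skip" the line by neutralising it, keeping its indentation
            # so that an enclosing block still has a body.
            indent = text[: len(text) - len(text.lstrip())]
            lines[lineno - 1] = indent + "pass"
        else:
            break
    return found


if __name__ == "__main__":
    sample = "a = 1 +\nb = 2\nc = = 3\nd = 4\n"   # made-up input
    for lineno, message, text in find_syntax_errors(sample):
        print(f"line {lineno}: {message}  (skipped: {text!r})")
```

The weakness is the one debated throughout the thread. Skipping a line such as `if condition` (missing colon) turns the indented block below it into a spurious "unexpected indent" report, so everything after the first report should be read as a hint rather than a fact.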
Re: What to use for finding as many syntax errors as possible.
On Tue, 11 Oct 2022 at 18:12, wrote:
>
> Thanks for a rather detailed explanation of some of what we have been
> discussing, Chris. The overall outline is about what I assumed was there but
> some of the details were, to put it politely, fuzzy.
>
> I see resemblances to something like how a web page is loaded and operated.
> I mean very different but at some level not so much.
>
> I mean a typical web page is read in as HTML with various keyword regions
> expected such as ... or ... with things
> often cleanly nested in others. The browser makes nodes galore in some kind
> of tree format with an assortment of objects whose attributes or methods
> represent aspects of what it sees. The resulting treelike structure has
> names like DOM.

Yes. The basic idea of "tokenize, parse, compile" can be used for pretty much any language - even English, although its grammar is a bit more convoluted than most programming languages, with many weird backward compatibility features! I'll parse your last sentence above:

    LETTERS The
    SPACE
    LETTERS resulting
    SPACE
    ... you get the idea
    LETTERS like
    SPACE
    LETTERS DOM
    FULLSTOP # or call this token PERIOD if you're American

Now, we can group those tokens into meaningful sets.

    Sentence(type=Statement,
        subject=Noun(name="structure", addenda=[
            Article(type=The),
            Adjective(name="treelike"),
        ]),
        verb=Verb(type=Being, name="has", addenda=[]),
        object=Noun(name="name", plural=True, addenda=[
            Adjective(phrase=Phrase(verb=Verb(name="like"), object=Noun(name="DOM"))),
        ]),
    )

Grammar nerds will probably dispute some of the awful shorthanding I did here, but I didn't want to devise thousands of AST nodes just for this :)

> To a certain approximation, this tree starts a certain way but is regularly
> being manipulated (or perhaps a copy is) as it regularly is looked at to see
> how to display it on the screen at the moment based on the current tree
> contents and another set of rules in Cascading Style Sheets.

Yep; the DOM tree is initialized from the HTML (usually - it's possible to start a fresh tree with no HTML) and then can be manipulated afterwards.

> These are not at all the same thing but share a certain set of ideas and
> methods and can be very powerful as things interact.

Oh absolutely. That's why there are languages designed to help you define other languages.

> In effect the errors in the web situation have such analogies too as in what
> happens if a region of HTML is not well-formed or uses a keyword not
> recognized.

And they're horribly horribly messy, due to a few decades of sloppy HTML programmers and the desire to still display the page even if things are messed up :) But, again, there's a huge difference between syntactic errors (like omitting a matching angle bracket) and semantic errors (a keyword not known, like using one tag name when you should have used another). In the latter case, you can still build a DOM tree, but you have an unknown element; in the former case, you have to guess at what the author meant, just to get anything going at all.

> There was a guy around a few years ago who suggested he would create a
> system where you could create a series of some kind of configuration files
> for ANY language and his system would then compile or run programs for each
> and every such language? Was that on this forum? What ever happened to him?

That was indeed on this forum, and I have no idea what happened to him. Maybe he realised that all he'd invented was the Unix shebang?

ChrisA
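The same "tokenize, parse, compile" stages can be watched on a line of Python using only the standard library. The sketch below is an illustration added for readers. The sample statement, the helper name `token_strings` and the `price` value are made up, while `tokenize.generate_tokens`, `ast.parse`, `compile` and `exec` are the real standard-library entry points.

```python
import ast
import io
import tokenize


def token_strings(source):
    """Stage 1: split source text into (token-type name, text) pairs."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


source = "total = price * 2\n"             # made-up sample statement

tokens = token_strings(source)             # NAME 'total', OP '=', NAME 'price', ...
tree = ast.parse(source)                   # Stage 2: tokens grouped into an AST
code = compile(tree, "<sketch>", "exec")   # Stage 3: AST compiled to bytecode

namespace = {"price": 21}                  # hypothetical input value
exec(code, namespace)                      # run the compiled statement

if __name__ == "__main__":
    for kind, text in tokens:
        print(kind, repr(text))
    print(ast.dump(tree))
    print(namespace["total"])
```

A misspelled name such as `prise` would pass all three stages and fail only when the code runs, whereas `total = = price` is rejected at the parse stage. That is the syntactic versus semantic distinction drawn in the message.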
Re: What to use for finding as many syntax errors as possible.
Sure it does. They’re optional and not enforced at runtime, but I find them useful when writing code in PyCharm:

    import os
    from os import DirEntry

    de: DirEntry                   # annotation only; nothing is assigned yet
    for de in os.scandir('/tmp'):
        print(de.name)

    de = 7                         # not enforced at runtime, so this still runs
    print(de)

Predeclaring de allows me to do the tab completion thing with DirEntry fields / methods.

From: Python-list on behalf of avi.e.gr...@gmail.com
Date: Monday, October 10, 2022 at 10:11 PM
To: python-list@python.org
Subject: RE: What to use for finding as many syntax errors as possible.

> Michael,
>
> A reasonable question. Python lets you initialize variables but has no
> explicit declarations.
Re: What to use for finding as many syntax errors as possible.
On 10/10/2022 at 19:08, Robert Latest via Python-list wrote:
> Antoon Pardon wrote:
> > I would like a tool that tries to find as many syntax errors as possible
> > in a python file.
>
> I'm puzzled as to when such a tool would be needed. How many syntax errors
> can you realistically put into a single Python file before compiling it
> for the first time?

Why are you puzzled? I don't need to make that many syntax errors to find such a tool useful.

--
Antoon Pardon
Re: What to use for finding as many syntax errors as possible.
On 10/10/2022 at 19:08, Robert Latest via Python-list wrote:
> Antoon Pardon wrote:
> > I would like a tool that tries to find as many syntax errors as possible
> > in a python file.
>
> I'm puzzled as to when such a tool would be needed. How many syntax errors
> can you realistically put into a single Python file before compiling it
> for the first time?

I've been following the discussion from a distance and the whole time I've been wondering the same thing. Especially when you have unit tests, as Antoon said he has, I can't really imagine a situation where you add so much code in one go without running it that you introduce a painful amount of syntax errors.

My solution would be to use a modern IDE with a linter, possibly with style warnings disabled, which will flag syntax errors as soon as you type them. Possibly combined with a TDD-style tactic which also prevents large amounts of errors (any errors) from building up. But I have the impression that none of those fits Antoon's workflow.

--
"Peace cannot be kept by force. It can only be achieved through understanding."
-- Albert Einstein
RE: What to use for finding as many syntax errors as possible.
Thanks for a rather detailed explanation of some of what we have been discussing, Chris. The overall outline is about what I assumed was there but some of the details were, to put it politely, fuzzy.

I see resemblances to something like how a web page is loaded and operated. I mean very different but at some level not so much.

I mean a typical web page is read in as HTML with various keyword regions expected such as ... or ... with things often cleanly nested in others. The browser makes nodes galore in some kind of tree format with an assortment of objects whose attributes or methods represent aspects of what it sees. The resulting treelike structure has names like DOM.

To a certain approximation, this tree starts a certain way but is regularly being manipulated (or perhaps a copy is) as it regularly is looked at to see how to display it on the screen at the moment based on the current tree contents and another set of rules in Cascading Style Sheets.

But bits and pieces of JavaScript are also embedded or imported that can read aspects of the tree (and more) and modify the contents and arrange for all kinds of asynchronous events when bits of code are invoked such as when you click a button or hover or when an image finishes loading or every 100 milliseconds. It can insert new objects into the DOM too. And of course there can be interactions with restricted local storage as well as with servers and code running there.

It is quite a mess but in some ways I see analogies. Your program reads a stream of data and looks for tokens and eventually turns things into a tree of sorts that represents relationships to a point. Additional structures eventually happen at run time that let you store collections of references to variables such as environments or namespaces and the program derived from the trees makes changes as it goes and in a language like Python can even possibly change the running program in some ways.

These are not at all the same thing but share a certain set of ideas and methods and can be very powerful as things interact. In the web case, the CSS may search for regions with some class or ID or that are the third element of a bullet list and more, using powerful tools like jQuery, and make changes. A CSS rule that previously ignored some region as not having a particular class, might start including it after a JavaScript segment is aroused while waiting on an event listener for say a mouse hovering over an area and then changes that part of the DOM (like a node) to be in that class. Suddenly the area on your screen changes background or whatever the CSS now dictates. We have multiple systems written in an assortment of "languages" that complement each other.

Some running programs, especially ones that use asynchronous methods like threads or callbacks on events, such as a GUI, can effectively do similar things.

In effect the errors in the web situation have such analogies too as in what happens if a region of HTML is not well-formed or uses a keyword not recognized.

This becomes even more interesting in XML where anything can be a keyword and you often need other kinds of files (often also in ML) to define what the XML can be like and what restrictions it may have, such as whether an element can have multiple authors but only one optional publication date and so on. It can be fascinating and highly technical. So I am up for a challenge of studying anything from early compilers for languages of my youth to more recent ways including some like what you show.

I have time to kill and this might be more fun than other things, for a while.

There was a guy around a few years ago who suggested he would create a system where you could create a series of some kind of configuration files for ANY language and his system would then compile or run programs for each and every such language? Was that on this forum? What ever happened to him?

But although what he promised seemed a bit too much, I can see from your comments below how in some ways a limited amount of that might be done for some subset of languages which can be parsed and manipulated as described.

-----Original Message-----
From: Python-list On Behalf Of Chris Angelico
Sent: Monday, October 10, 2022 11:55 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On Tue, 11 Oct 2022 at 14:26, wrote:
>
> I stand corrected Chris, and others, as I pay the sin tax.
>
> Yes, there are many kinds of errors that logically fall into different
> categories or phases of evaluation of a program and some can be
> determined by a more static analysis almost on a line by line (or
> "statement" or "expression", ...) basis and others need to sort of
> simulate some things and look back and forth to detect possible
> incompatibilities and yet others can only be detected at run time and
> likely way more categories depending on the language.
>
> But
What to use for finding as many syntax errors as possible.
I think we are in agreement here, Chris. My point is that the error detection and correction is now done at levels where there is not much need to use earlier and inefficient methods like parity bits set aside. We use protocols like TCP and IP and layers above them and above those to maintain the integrity of packets and sessions and forms of encryption allowing things like authentication. There is tons of overhead, even when some is fairly efficient, but we hardly notice it unless things go wrong.

So written language sent (as in this email/post) does not need lots of redundancy and all the extra effort is, IMNSHO, largely wasted. If I see a bear, I do not wish to check their genitals or DNA to determine their irrelevant gender before asking someone to run from it. If I happen to know the gender, as in a zoo, gender only matters for things like breeding purposes. I do not want to memorize terms in languages that have not only words like lion and lioness or duck and drake and goose and gander, but for EVERYTHING in some sense so I can say the equivalent of ANIMAL-male and ANIMAL-female with unique words. Life would be so much simpler if I could say your dog was nice and not be corrected that it was a bitch and I used the wrong word endings. If I really wanted to say it was a female dog, well I could just add a qualifier. Most of the time, who cares?

The same applies to so much grammatical nonsense which is also usually riddled with endless exceptions to the many rules. Make the languages simple with little redundancy and thus far easier to learn.

I can say similar things about some programming languages that either have way too many rules or too few of the right ones. There are tradeoffs and if you want a powerful language it will likely not be easy to control. If you want a very regulated language, you may find it not very useful as many things are hard to do and others not possible.
I know that strongly typed languages often have to allow some method of cheating such as unions of data types, or using a parent class as the sort of object-type to allow disparate objects to live together. Python is far from the most complex but as noted, it is not trivial to evaluate even the syntax past errors. But I admit it is fun and a challenge to learn both kinds and I spent much of my time doing so. I like the flexibility of seeing different approaches and holding contradictions in my mind while accepting both and yet neither! LOL! -Original Message- From: Python-list On Behalf Of Chris Angelico Sent: Monday, October 10, 2022 11:24 PM To: python-list@python.org Subject: Re: What to use for finding as many syntax errors as possible. On Tue, 11 Oct 2022 at 14:13, wrote: > With the internet today, we are used to expecting error correction to > come for free. Do you really need one of every 8 bits to be a parity > bit, which only catches may half of the errors... Fortunately, we have WAY better schemes than simple parity, which was only really a thing in the modem days. (Though I would say that there's still a pretty clear distinction between a good message where everything has correct parity, and line noise where half of them don't.) Hamming codes can correct one-bit errors (and detect two-bit errors) at a price of log2(size)+1 bits of space. Here's a great rundown: https://www.youtube.com/watch?v=X8jsijhllIA There are other schemes too, but Hamming codes are beautifully elegant and easy to understand. ChrisA -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On Tue, 11 Oct 2022 at 14:26, wrote:
>
> I stand corrected Chris, and others, as I pay the sin tax.
>
> Yes, there are many kinds of errors that logically fall into different
> categories or phases of evaluation of a program and some can be determined
> by a more static analysis almost on a line by line (or "statement" or
> "expression", ...) basis and others need to sort of simulate some things
> and look back and forth to detect possible incompatibilities and yet others
> can only be detected at run time and likely way more categories depending on
> the language.
>
> But when I run the Python interpreter on code, aren't many such phases done
> interleaved and at once as various segments of code are parsed and examined
> and perhaps compiled into block code and eventually executed?

Hmm, depends what you mean. Broadly speaking, here's how it goes:

0) Early pre-parse steps that don't really matter to most programs, like checking character set. We'll ignore these.
1) Tokenize the text of the program into a sequence of potentially-meaningful units.
2) Parse those tokens into some sort of meaningful "sentence".
3) Compile the syntax tree into actual code.
4) Run that code.

Example:

>>> code = """def f():
...     print("Hello, world", 1>=2)
...     print(Ellipsis, ...)
...     return True
... """
>>>

In step 1, all that happens is that a stream of characters (or bytes, depending on your point of view) gets broken up into units.

>>> import tokenize
>>> for t in tokenize.tokenize(iter(code.encode().split(b"\n")).__next__):
...     print(tokenize.tok_name[t.exact_type], t.string)

It's pretty spammy, but you can see how the compiler sees the text. Note that, at this stage, there's no real difference between the NAME "def" and the NAME "print" - there are no language keywords yet. Basically, all you're doing is figuring out punctuation and stuff.

Step 2 is what we'd normally consider "parsing". (It may well happen concurrently and interleaved with tokenizing, and I'm giving a simplified and conceptualized pipeline here, but this is broadly what Python does.) This compares the stream of tokens to the grammar of a Python program and attempts to figure out what it means. At this point, the linear stream turns into a recursive syntax tree, but it's still very abstract.

>>> import ast
>>> ast.dump(ast.parse(code))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Constant(value='Hello, world'), Compare(left=Constant(value=1), ops=[GtE()], comparators=[Constant(value=2)])], keywords=[])), Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Name(id='Ellipsis', ctx=Load()), Constant(value=Ellipsis)], keywords=[])), Return(value=Constant(value=True))], decorator_list=[])], type_ignores=[])"

(Side point: I would rather like to be able to pprint.pprint(ast.parse(code)) but that isn't a thing, at least not currently.)

This is where the vast majority of SyntaxErrors come from. Your code is a sequence of tokens, but those tokens don't mean anything. It doesn't make sense to say "print(def f[return)]" even though that'd tokenize just fine. The trouble with the notion of "keeping going after finding an error" is that, when you find an error, there are almost always multiple possible ways that this COULD have been interpreted differently. It's as likely to give nonsense results as actually useful ones.

(Note that, in contrast to the tokenization stage, this version distinguishes between the different types of word. The "def" has resulted in a FunctionDef node, the "print" is a Name lookup, and both "..." and "True" have now become Constant nodes - previously, "..." was a special Ellipsis token, but "True" was just a NAME.)

Step 3: the abstract syntax tree gets compiled into actual runnable code. This is where that small handful of other SyntaxErrors come from. With these errors, you absolutely _could_ carry on and report multiple; but it's not very likely that there'll actually *be* more than one of them in a file. Here's some perfectly valid AST parsing:

>>> ast.dump(ast.parse("from __future__ import the_past"))
"Module(body=[ImportFrom(module='__future__', names=[alias(name='the_past')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("from __future__ import braces"))
"Module(body=[ImportFrom(module='__future__', names=[alias(name='braces')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("def f():\n\tdef g():\n\t\tnonlocal x\n"))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[FunctionDef(name='g', args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Nonlocal(names=['x'])], decorator_list=[])], decorator_list=[])], type_ignores=[])"

If you were to try to actually compile those to bytecode, they would fail:

>>> compile(ast.parse("from __future__ import braces"), "-", "exec")
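The split described in this message, between errors raised while parsing and the smaller group raised only when the tree is compiled, can be sketched roughly as follows. This is an illustration added for the archive, not part of the original post: the helper name `failing_stage` and the filename `<example>` are made up, and only `ast.parse` and the built-in `compile` are real APIs.

```python
import ast


def failing_stage(source):
    """Return (stage, message) for the first stage that rejects `source`.

    stage is "parse" if ast.parse() raises SyntaxError, "compile" if the
    tree parses but compile() rejects it, or None if both stages succeed.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return "parse", exc.msg
    try:
        # "<example>" is a placeholder filename used only for error messages.
        compile(tree, "<example>", "exec")
    except SyntaxError as exc:
        return "compile", exc.msg
    return None, None


samples = [
    "print(def f[return)]",                      # grammar error
    "from __future__ import braces",             # parses, fails to compile
    "def f():\n    await q\n",                   # parses, fails to compile
    "def f():\n\tdef g():\n\t\tnonlocal x\n",    # parses, fails to compile
    "def f():\n    return True\n",               # fine at both stages
]
for text in samples:
    print(repr(text), "->", failing_stage(text))
```

On Python 3.9 and later, `ast.dump(tree, indent=4)` also gives a more readable, indented dump than the single-line output shown in the message.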
Re: What to use for finding as many syntax errors as possible.
On Tue, 11 Oct 2022 at 14:13, wrote: > With the internet today, we are used to expecting error correction to come > for free. Do you really need one of every 8 bits to be a parity bit, which > only catches may half of the errors... Fortunately, we have WAY better schemes than simple parity, which was only really a thing in the modem days. (Though I would say that there's still a pretty clear distinction between a good message where everything has correct parity, and line noise where half of them don't.) Hamming codes can correct one-bit errors (and detect two-bit errors) at a price of log2(size)+1 bits of space. Here's a great rundown: https://www.youtube.com/watch?v=X8jsijhllIA There are other schemes too, but Hamming codes are beautifully elegant and easy to understand. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
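To make the single-error correction mentioned above concrete, here is a small sketch of the classic Hamming(7,4) code: four data bits protected by three parity bits. It is an illustration only, not taken from the linked video; the function names are invented for the example, and the extra overall parity bit needed for two-bit error detection is left out to keep it short.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword.

    Bit positions 1..7 hold: p1 p2 d1 p4 d2 d3 d4, where each parity bit
    covers the positions whose index has the matching binary digit set.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if clean.

    Recomputing the three parity checks gives a syndrome that, read as a
    binary number, is the 1-based position of a single flipped bit.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1   # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                       # simulate line noise on position 5
print(hamming74_decode(word))
```

A plain parity bit could only say that something in the word was wrong; here the three checks together also say where, which is what allows the correction.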
RE: What to use for finding as many syntax errors as possible.
I stand corrected Chris, and others, as I pay the sin tax. Yes, there are many kinds of errors that logically fall into different categories or phases of evaluation of a program and some can be determined by a more static analysis almost on a line by line (or "statement" or "expression", ...) basis and others need to sort of simulate some things and look back and forth to detect possible incompatibilities and yet others can only be detected at run time and likely way more categories depending on the language. But when I run the Python interpreter on code, aren't many such phases done interleaved and at once as various segments of code are parsed and examined and perhaps compiled into block code and eventually executed? So is the OP asking for something other than a Python Interpreter that normally halts after some kind of error? Tools like a linter may indeed fit that mold. This may limit some of the objections of when an error makes it hard for the parser to find some recovery point to continue from as no code is being run and no harmful side effects happen by continuing just an analysis. Time to go read some books about modern ways to evaluate a language based on more mathematical rules including more precisely what is syntax versus ... Suggestions? -Original Message- From: Python-list On Behalf Of Chris Angelico Sent: Monday, October 10, 2022 10:42 PM To: python-list@python.org Subject: Re: What to use for finding as many syntax errors as possible. On Tue, 11 Oct 2022 at 13:10, wrote: > If the above is: > > Import grumpy as np > > Then what happens if the code tries to find a file named "grumpy" > somewhere and cannot locate it and this is considered a syntax error > rather than a run-time error for whatever reason? Can you continue > when all kinds of functionality is missing and code asking to make a > np.array([1,2,3]) clearly fails? That's not a syntax error. Syntax is VERY specific. 
It is an error in Python to attempt to add 1 to "one", it is an error to attempt to look up the upper() method on None, it is an error to try to use a local variable you haven't assigned to yet, and it is an error to open a file that doesn't exist. But not one of these is a *syntax* error. Syntax errors are detected at the parsing stage, before any code gets run. The vast majority of syntax errors are grammar errors, where the code doesn't align with the parseable text of a Python program. (Non-grammatical parsing errors include using a "nonlocal" statement with a name that isn't found in any surrounding scope, using "await" in a non-async function, and attempting to import braces from the future.) ChrisA -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
RE: What to use for finding as many syntax errors as possible.
Cameron, or OP if you prefer, I think by now you have seen a suggestion that languages make choices and highly structured ones can be easier to "recover" from errors and try to continue than some with way more complex possibilities that look rather unstructured. What is the error in code like this? A,b,c,d = 1,2, Or is it an error at all? Many languages have no concept of doing anything like the above and some tolerate a trailing comma and some set anything not found to some form of NULL or uninitialized and some ... If you look at human language, some are fairly simple and some are way too organized. But in a way it can make sense. Languages with gender will often ask you to change the spelling and often how you pronounce things not only based on whether a noun is male/female or even neuter but also insist you change the form of verbs or adjectives and so on that in effect give multiple signals that all have to line up to make a valid and understandable sentence. Heck, in conversations, people can often leave out parts of a sentence such as whether you are talking about "I" or "you" or "she" or "we" because the rest of the words in the sentence redundantly force only one choice to be possible. So some such annoying grammars (in my opinion) are error detection/correction codes in disguise. In days before microphones and speakers, it was common to not hear people well, like on a stage a hundred feet away with other ambient noises. Missing a word or two might still allow you to get the point as other parts of the sentence did such redundancies. Many languages have similar strictures letting you know multiple times if something is singular or plural. And I think another reason was what I call stranger detection. People who learn some vocabulary might still not speak correctly and be identifiable as strangers, as in spies. Do we need this in the modern age? Who knows! But it makes me prefer some languages over others albeit other reasons may ... 
With the internet today, we are used to expecting error correction to come for free. Do you really need one of every 8 bits to be a parity bit, which only catches may half of the errors, when the internals of your computer are relatively error free and even the outside is protected by things like various protocols used in making and examining packets and demanding some be sent again if some checksum does not match? Tons of checking is built in so at your level you rarely think about it. If you get a message, it usually is either 99.% accurate, or you do not have it shown to you at all. I am not talking about SPAM but about errors of transmission. So my analogies are that if you want a very highly structured language that can recover somewhat from errors, Python may not be it. And over the years as features are added or modified, the structure tends to get more complex. And R is not alone. Many surviving languages continue to evolve and borrow from each other and any program that you run today that could partially recover and produce pages of possible errors, may blow up when new features are introduced. And with UNICODE, the number of possible "errors" in what is placed in code for languages like Julia that allow them in most places ... -Original Message- From: Python-list On Behalf Of Cameron Simpson Sent: Monday, October 10, 2022 6:17 PM To: python-list@python.org Subject: Re: What to use for finding as many syntax errors as possible. On 11Oct2022 08:02, Chris Angelico wrote: >There's a huge difference between non-fatal errors and syntactic >errors. The OP wants the parser to magically skip over a fundamental >syntactic error and still parse everything else correctly. That's never >going to work perfectly, and the OP is surprised at this. The OP is not surprised by this, and explicitly expressed awareness that resuming a parse had potential for "misparsing" further code. 
I remain of the opinion that one could resume a parse at the next unindented line and get reasonable results a lot of the time. In fact, I expect that one could resume tokenising at almost any line which didn't seem to be inside a string and often get reasonable results. I grew up with C and Pascal compilers which would _happily_ produce many complaints, usually accurate, and all manner of syntactic errors. They didn't stop at the first syntax error. All you need in principle is a parser which goes "report syntax error here, continue assuming ". For Python that might mean "pretend a missing final colon" or "close open brackets" etc, depending on the context. If you make conservative implied corrections you can get a reasonable continued parse, enough to find further syntax errors. I remember the Pascal compiler in particular had a really good "you missed a semicolon _back there_" mode which was almost alwa
Re: What to use for finding as many syntax errors as possible.
On Tue, 11 Oct 2022 at 13:10, wrote: > If the above is: > > Import grumpy as np > > Then what happens if the code tries to find a file named "grumpy" somewhere > and cannot locate it and this is considered a syntax error rather than a > run-time error for whatever reason? Can you continue when all kinds of > functionality is missing and code asking to make a np.array([1,2,3]) clearly > fails? That's not a syntax error. Syntax is VERY specific. It is an error in Python to attempt to add 1 to "one", it is an error to attempt to look up the upper() method on None, it is an error to try to use a local variable you haven't assigned to yet, and it is an error to open a file that doesn't exist. But not one of these is a *syntax* error. Syntax errors are detected at the parsing stage, before any code gets run. The vast majority of syntax errors are grammar errors, where the code doesn't align with the parseable text of a Python program. (Non-grammatical parsing errors include using a "nonlocal" statement with a name that isn't found in any surrounding scope, using "await" in a non-async function, and attempting to import braces from the future.) ChrisA -- https://mail.python.org/mailman/listinfo/python-list
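The distinction drawn above can be checked directly: each of the four error examples compiles without complaint and only fails when it is run. The sketch below is an illustration with made-up snippets and a hypothetical helper, `classify`; the file path is assumed not to exist.

```python
# Hypothetical snippets mirroring the four runtime errors listed above.
RUNTIME_FAILURES = {
    '1 + "one"': TypeError,
    'None.upper()': AttributeError,
    'def f():\n    print(x)\n    x = 1\nf()': UnboundLocalError,
    'open("/no/such/dir/no-such-file")': FileNotFoundError,
}


def classify(source):
    """Return "syntax" if the source does not compile, otherwise the name
    of the exception raised when running it ("ok" if nothing is raised)."""
    try:
        code = compile(source, "<example>", "exec")
    except SyntaxError:
        return "syntax"
    try:
        exec(code, {})
    except Exception as exc:
        return type(exc).__name__
    return "ok"


for source in RUNTIME_FAILURES:
    print(repr(source), "->", classify(source))
print(repr("print(def f[return)]"), "->", classify("print(def f[return)]"))
```

Only the last line is rejected before any code runs; the others are ordinary exceptions that a syntax-only tool would never see.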
RE: What to use for finding as many syntax errors as possible.
Michael,

A reasonable question. Python lets you initialize variables but has no explicit declarations. Languages differ and I juggle attributes of many in my mind and am reacting to the original question NOT about whether and how Python should report many possible errors all at once but how ANY language can be expected to do this well. Many others do have a variable declaration phase or an optional declaration or perhaps just a need to declare a function prototype so it can be used by others even if the formal function creation will happen later in the code.

But what I meant in a Python context was something like this:

    Wronk = who cares # this should fail
    ...
    if (Wronk > 5): ...
    ...
    Wronger = Wronk + 1
    ...
    X = minimum(Wronk, Wronger, 12)

The first line does not parse well so you have an error. But in any case as the line makes no sense, Wronk is not initialized to anything. Later code may use it in various ways and some of those may be seen as errors for an assortment of reasons, then at one point the code does provide a value for Wronk and suddenly code beyond that has no seeming errors.

The above examples are not meant to be real but just give a taste that programs with holes in them for any reason may not be consistent. The only relatively guaranteed test for sanity has to start at the top and encounter no errors or missing parts based on anything such as I/O errors.

And I suggest there are some things sort of declared in python such as:

    import numpy as np

Yes, that brings in code from a module if it works and initializes a variable called np to sort of point at the module or its namespace or whatever, depending on the language. It is an assignment but also a way to let the program know things.

If the above is:

    import grumpy as np

Then what happens if the code tries to find a file named "grumpy" somewhere and cannot locate it and this is considered a syntax error rather than a run-time error for whatever reason?
Can you continue when all kinds of functionality is missing and code asking to make a np.array([1,2,3]) clearly fails? Many of us here are talking past each other. Yes, it would be nice to get lots of info and arguably we may eventually have machine-learning or AI programs a bit more like SPAM detectors that look for patterns commonly found and try to fix your program from common errors or at least do a temporary patch so they can continue searching for more errors. This could result in the best case in guessing right every time. If you allowed it to actually fix your code, it might be like people who let their spelling be corrected and do not proofread properly and send out something embarrassing or just plain wrong! And it will compile or be interpreted without complaint albeit not do exactly what it is supposed to! -Original Message- From: Python-list On Behalf Of Michael F. Stemper Sent: Monday, October 10, 2022 9:22 AM To: python-list@python.org Subject: Re: What to use for finding as many syntax errors as possible. On 09/10/2022 10.49, Avi Gross wrote: > Anton > > There likely are such programs out there but are there universal > agreements on how to figure out when a new safe zone of code starts > where error testing can begin? > > For example a file full of function definitions might find an error in > function 1 and try to find the end of that function and resume > checking the next function. But what if a function defines local functions within it? > What if the mistake in one line of code could still allow checking the > next line rather than skipping it all? > > My guess is that finding 100 errors might turn out to be misleading. > If you fix just the first, many others would go away. If you spell a > variable name wrong when declaring it, a dozen uses of the right name may cause errors. > Should you fix the first or change all later ones? How does one declare a variable in python? 
Sometimes it'd be nice to be able to have declarations and any undeclared variable be flagged. When I was writing F77 for a living, I'd (temporarily) put: IMPLICIT CHARACTER*3 at the beginning of a program or subroutine that I was modifying, in order to have any typos flagged. I'd love it if there was something similar that I could do in python. -- Michael F. Stemper 87.3% of all statistics are made up by the person giving them. -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 10/10/2022 9:21 AM, Michael F. Stemper wrote: On 09/10/2022 10.49, Avi Gross wrote: Anton There likely are such programs out there but are there universal agreements on how to figure out when a new safe zone of code starts where error testing can begin? For example a file full of function definitions might find an error in function 1 and try to find the end of that function and resume checking the next function. But what if a function defines local functions within it? What if the mistake in one line of code could still allow checking the next line rather than skipping it all? My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. If you spell a variable name wrong when declaring it, a dozen uses of the right name may cause errors. Should you fix the first or change all later ones? How does one declare a variable in python? Sometimes it'd be nice to be able to have declarations and any undeclared variable be flagged. When I was writing F77 for a living, I'd (temporarily) put: IMPLICIT CHARACTER*3 at the beginning of a program or subroutine that I was modifying, in order to have any typos flagged. I'd love it if there was something similar that I could do in python. The Leo editor (https://github.com/leo-editor/leo-editor) will notify you of undeclared variables (and some syntax errors) each time you save your (Python) file. -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On Tue, 11 Oct 2022 at 09:18, Cameron Simpson wrote:
>
> On 11Oct2022 08:02, Chris Angelico wrote:
> >There's a huge difference between non-fatal errors and syntactic
> >errors. The OP wants the parser to magically skip over a fundamental
> >syntactic error and still parse everything else correctly. That's
> >never going to work perfectly, and the OP is surprised at this.
>
> The OP is not surprised by this, and explicitly expressed awareness that
> resuming a parse had potential for "misparsing" further code.
>
> I remain of the opinion that one could resume a parse at the next
> unindented line and get reasonable results a lot of the time.

The next line at the same indentation level as the line with the error, or the next flush-left line? Either way, there's a weird and arbitrary gap before you start parsing again, and you still have no indication of what could make sense. Consider:

if condition # no colon
    code
else:
    code

To actually "restart" parsing, you have to make a guess of some sort. Maybe you can figure out what the user meant to do, and parse accordingly; but if that's the case, keep going immediately, don't wait for an unindented line. If you wait for a blank line followed by an unindented line, that might help with a notion of "next logical unit of code", but it's very much dependent on the coding style, and if you have a codebase that's so full of syntax errors that you actually want to see more than one, you probably don't have a codebase with pristine and beautiful code layout.

> In fact, I expect that one could resume tokenising at almost any line
> which didn't seem to be inside a string and often get reasonable
> results.

"Seem to be"? On what basis?

> I grew up with C and Pascal compilers which would _happily_ produce many
> complaints, usually accurate, and all manner of syntactic errors. They
> didn't stop at the first syntax error.

Yes, because they work with a much simpler grammar. But even then, most syntactic errors (again, this is not to be confused with semantic errors - if you say "char *x = 1.234;" then there's no parsing ambiguity but it's not going to compile) cause a fair degree of nonsense afterwards.

The waters are a bit muddied by some things being called "syntax errors" when they're actually nothing at all to do with the parser. For instance:

>>> def f():
...     await q
...
  File "<stdin>", line 2
SyntaxError: 'await' outside async function

This is not what I'm talking about; there's no parsing ambiguity here, and therefore no difficulty whatsoever in carrying on with the parsing. You could ast.parse() this code without an error. But resuming after a parsing error is fundamentally difficult, impossible without guesswork.

> All you need in principle is a parser which goes "report syntax error
> here, continue assuming ". For Python that might mean
> "pretend a missing final colon" or "close open brackets" etc, depending
> on the context. If you make conservative implied corrections you can get
> a reasonable continued parse, enough to find further syntax errors.

And, more likely, you'll generate a lot of nonsense. Take something like this:

items = [
    item[1],
    item2],
    item[3],
]

As a human, you can easily see what the problem is. Try teaching a parser how to handle this. Most likely, you'll generate a spurious error - maybe the indentation, maybe the intended end of the list - but there's really only one error here. Reporting multiple errors isn't actually going to be at all helpful.

> I remember the Pascal compiler in particular had a really good "you
> missed a semicolon _back there_" mode which was almost always correct, a
> nice boon when correcting mistakes.
>

Ahh yes. Design a language with strict syntactic requirements, and it's not too hard to find where the programmer has omitted them. Thing is, Python just doesn't HAVE those semicolons. Let's say that a variant Python required you to put a U+251C ├ at the start of every statement, and U+2524 ┤ at the end of the statement. A whole lot of classes of error would be extremely easy to notice and correct, and thus you could resume parsing; but that isn't benefiting the programmer any. When you don't have that kind of information duplication, it's a lot harder to figure out how to cheat the fix and go back to parsing.

ChrisA
-- https://mail.python.org/mailman/listinfo/python-list
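The `items` example above is easy to try. The sketch below, added as an illustration, feeds that snippet to the built-in `compile` and records what gets reported; the variable names and the `<example>` filename are made up, and the exact wording of the message may differ between Python versions.

```python
# The list from the message, with the typo on line 3 ("item2]" for "item[2]").
SOURCE = """\
items = [
    item[1],
    item2],
    item[3],
]
"""

try:
    compile(SOURCE, "<example>", "exec")
except SyntaxError as exc:
    # IndentationError is a subclass of SyntaxError, so it lands here too.
    reported = (type(exc).__name__, exc.lineno)
else:
    reported = None

print(reported)
```

On the CPython versions I am aware of, the complaint is about unexpected indentation on line 4 rather than the stray bracket on line 3, because `items = [item[1], item2],` is a perfectly valid statement by itself. That is the kind of misleading report a recovering parser would have to build on.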
Re: What to use for finding as many syntax errors as possible.
On 09/10/2022 10.49, Avi Gross wrote: My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. If you spell a variable name wrong when declaring it, a dozen uses of the right name may cause errors. Should you fix the first or change all later ones? Just to this, these are semantic errors, not syntax errors. Linters do an ok job of spotting these. Antoon is after _syntax errors_. On 10Oct2022 08:21, Michael F. Stemper wrote: How does one declare a variable in python? Sometimes it'd be nice to be able to have declarations and any undeclared variable be flagged. Linters do pretty well at this. They can trace names and their use compared to their first definition/assignment (often - there are of course some constructs which are correct but unclear to a static analysis - certainly one of my linters occasionally says "possible undefine use" to me because there may be a path to use before set). This is particularly handy for typos, which often make for "use before set" or "set and not used". I'd love it if there was something similar that I could do in python. Have you used any lint programmes? My "lint" script runs pyflakes and pylint. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 11Oct2022 08:02, Chris Angelico wrote: There's a huge difference between non-fatal errors and syntactic errors. The OP wants the parser to magically skip over a fundamental syntactic error and still parse everything else correctly. That's never going to work perfectly, and the OP is surprised at this. The OP is not surprised by this, and explicitly expressed awareness that resuming a parse had potential for "misparsing" further code. I remain of the opinion that one could resume a parse at the next unindented line and get reasonable results a lot of the time. In fact, I expect that one could resume tokenising at almost any line which didn't seem to be inside a string and often get reasonable results. I grew up with C and Pascal compilers which would _happily_ produce many complaints, usually accurate, and all manner of syntactic errors. They didn't stop at the first syntax error. All you need in principle is a parser which goes "report syntax error here, continue assuming ". For Python that might mean "pretend a missing final colon" or "close open brackets" etc, depending on the context. If you make conservative implied corrections you can get a reasonable continued parse, enough to find further syntax errors. I remember the Pascal compiler in particular had a really good "you missed a semicolon _back there_" mode which was almost always correct, a nice boon when correcting mistakes. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
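The "resume at the next unindented line" idea can be tried with nothing but the standard library. What follows is a rough sketch under loud assumptions, not an existing tool: `split_top_level` and `find_syntax_errors` are hypothetical helpers, chunks are split purely on flush-left lines, and a multi-line string or bracketed expression whose continuation lines start in column 0 would be split wrongly and produce spurious errors.

```python
import ast
import re

# Flush-left lines that continue the previous block rather than start a new one.
CONTINUATION = re.compile(r"(else|elif|except|finally)\b|[)\]}]")


def split_top_level(source):
    """Yield (first_line_number, text) for each flush-left chunk of source."""
    chunk, start, after_decorator = [], 1, False
    for lineno, line in enumerate(source.splitlines(keepends=True), 1):
        # Blank lines, comments and indented lines stay with the current chunk.
        flush_left = line[:1] not in (" ", "\t", "\n", "\r", "#")
        starts_new = (flush_left and not after_decorator
                      and not CONTINUATION.match(line))
        if starts_new and chunk:
            yield start, "".join(chunk)
            chunk, start = [], lineno
        chunk.append(line)
        if flush_left:
            after_decorator = line.startswith("@")
    if chunk:
        yield start, "".join(chunk)


def find_syntax_errors(source):
    """Parse each chunk separately; return [(line_in_file, message), ...]."""
    errors = []
    for start, text in split_top_level(source):
        try:
            ast.parse(text)
        except SyntaxError as exc:
            errors.append((start + (exc.lineno or 1) - 1, exc.msg))
    return errors


EXAMPLE = """\
def a():
    return (1 +

def b():
    return 2

def c()
    return 3
"""

for line, message in find_syntax_errors(EXAMPLE):
    print(line, message)
```

Parsing the whole of `EXAMPLE` in one go stops at the first problem, while the chunked version also reaches the missing colon in `c`. Whether the extra reports are useful or noise on real code is exactly the disagreement in this thread.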
Re: What to use for finding as many syntax errors as possible.
Antoon Pardon wrote: > I would like a tool that tries to find as many syntax errors as possible > in a python file. I'm puzzled as to when such a tool would be needed. How many syntax errors can you realistically put into a single Python file before compiling it for the first time? -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
Michael F. Stemper wrote: > How does one declare a variable in python? Sometimes it'd be nice to > be able to have declarations and any undeclared variable be flagged. To my knowledge, the closest to that is using __slots__ in class definitions. Many a time have I assigned to misspelled class members until I discovered __slots__. -- https://mail.python.org/mailman/listinfo/python-list
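A minimal illustration of the `__slots__` point, with a hypothetical `Point` class: once `__slots__` is set, assigning to an attribute that is not listed raises `AttributeError` instead of silently creating a new attribute. This is a run-time check on instance attributes only; it does nothing for local or global variables.

```python
class Point:
    # Only these attribute names may be assigned on instances.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
p.x = 10  # fine: "x" is declared in __slots__

try:
    p.z = 3  # misspelled or undeclared attribute
    slot_error = None
except AttributeError as exc:
    slot_error = exc
```

Without the `__slots__` line, `p.z = 3` would succeed and the typo would go unnoticed.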
Re: What to use for finding as many syntax errors as possible.
wrote: > Cameron, > > Your suggestion makes me shudder! Me, too > Removing all earlier lines of code is often guaranteed to generate errors as > variables you are using are not declared or initiated, modules are not > imported and so on. all of which aren't syntax errors, so the method should still work. Ugly as hell though. I can't think of a reason to want to find multiple syntax errors in a file. -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 09/10/2022 10.49, Avi Gross wrote: Anton There likely are such programs out there but are there universal agreements on how to figure out when a new safe zone of code starts where error testing can begin? For example a file full of function definitions might find an error in function 1 and try to find the end of that function and resume checking the next function. But what if a function defines local functions within it? What if the mistake in one line of code could still allow checking the next line rather than skipping it all? My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. If you spell a variable name wrong when declaring it, a dozen uses of the right name may cause errors. Should you fix the first or change all later ones? How does one declare a variable in python? Sometimes it'd be nice to be able to have declarations and any undeclared variable be flagged. When I was writing F77 for a living, I'd (temporarily) put: IMPLICIT CHARACTER*3 at the beginning of a program or subroutine that I was modifying, in order to have any typos flagged. I'd love it if there was something similar that I could do in python. -- Michael F. Stemper 87.3% of all statistics are made up by the person giving them. -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On Tue, 11 Oct 2022 at 06:34, Peter J. Holzer wrote: > > On 2022-10-10 09:23:27 +1100, Chris Angelico wrote: > > On Mon, 10 Oct 2022 at 06:50, Antoon Pardon wrote: > > > I just want a parser that doesn't give up on encoutering the first syntax > > > error. Maybe do some semantic checking like checking the number of > > > parameters. > > > > That doesn't make sense though. > > I think you disagree with most compiler authors here. > > > It's one thing to keep going after finding a non-syntactic error, but > > an error of syntax *by definition* makes parsing the rest of the file > > dubious. > > Dubious but still useful. There's a huge difference between non-fatal errors and syntactic errors. The OP wants the parser to magically skip over a fundamental syntactic error and still parse everything else correctly. That's never going to work perfectly, and the OP is surprised at this. > > What would it even *mean* to not give up? > > Read the blog post on Lezer for some ideas: > https://marijnhaverbeke.nl/blog/lezer.html > > This is in the context of an editor. Incidentally, that's actually where I would expect to see that kind of feature show up the most - syntax highlighters will often be designed to "carry on, somehow" after a syntax error, even though it often won't make any sense (just look at what happens to your code highlighting when you omit a quote character). It still won't always be any use, but you do see *some* attempt at it. But if the OP would be satisfied with that, I rather doubt that this thread would even have happened. Unless, of course, the OP still lives in the dark ages when no text editor available had any suitable features for code highlighting. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 2022-10-10 09:23:27 +1100, Chris Angelico wrote: > On Mon, 10 Oct 2022 at 06:50, Antoon Pardon wrote: > > I just want a parser that doesn't give up on encountering the first syntax > > error. Maybe do some semantic checking like checking the number of > > parameters. > > That doesn't make sense though. I think you disagree with most compiler authors here. > It's one thing to keep going after finding a non-syntactic error, but > an error of syntax *by definition* makes parsing the rest of the file > dubious. Dubious but still useful. > What would it even *mean* to not give up? Read the blog post on Lezer for some ideas: https://marijnhaverbeke.nl/blog/lezer.html This is in the context of an editor. But the same problem applies to compilers. It's not very important if a compile run only takes a second or so, but even then it might be helpful to see several error messages and not only one at a time. It becomes much more important as compile times get longer. (As an extreme[1] example, when I worked on a largish COBOL program in the 1980s, compiling the thing took about half an hour. I really wanted to fix *everything* before starting the compiler again.) Marijn isn't the only person who revisited this problem recently[2]. I've read a few other blog posts and papers on that topic at about the same time. hp [1] Yes, there are programs where a full compile takes much longer than that. But you can usually get away with recompiling only a small part, so you don't have to wait that long during normal development. That COBOL compiler couldn't do that. [2] "Recently" means "in the last 10 years or so". -- Peter J. Holzer | h...@hjp.at | http://www.hjp.at/ | "Story must make more sense than reality." -- Charles Stross, "Creative writing challenge!" -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 10Oct2022 09:04, Antoon Pardon wrote: >> It is easy to get the syntax right before submitting to such a pipeline. I usually run a linter on my code for serious commits, and I've got a `lint1` alias which basically runs the short fast flavour of that which does a syntax check and the very fast less thorough lint phase. > If you have a linter that doesn't quit after the first syntax error, please provide a link. I already tried pylint and it also quits after the first syntax error. I don't have such a linter. I did outline an approach for you to write one of your own by wrapping an existing parser program. I have a personal "lint" script which runs a few linters. The first check is `py_compile`, which quits at the first syntax error. The other linters are not even tried if that fails. I do not know what your editing environment is; I'd have thought that some IDEs should make the first syntax error very obvious and easy to go to, along with an obvious indication of whether the file as a whole is syntactically good or bad. If you have such, between them you could fairly easily resolve syntax errors rapidly, perhaps rapidly enough to make up for a stop-at-the-first-fail syntax check. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
Op 10/10/2022 om 00:45 schreef Cameron Simpson: On 09Oct2022 21:46, Antoon Pardon wrote: Is it that onerous to fix one thing and run it again? It was once when you handed in punch cards and waited a day or on very busy machines. Yes I find it onerous, especially since I have a pipeline with unit tests and other tools that all have to redo their work each time a bug is corrected. It is easy to get the syntax right before submitting to such a pipeline. I usually run a linter on my code for serious commits, and I've got a `lint1` alias which basicly runs the short fast flavour of that which does a syntax check and the very fast less thorough lint phase. If you have a linter that doesn't quit after the first syntax error, please provide a link. I already tried pylint and it also quits after the first syntax error. -- Antoon Pardon -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 10Oct2022 00:41, avi.e.gr...@gmail.com wrote: Your suggestion makes me shudder! And fair enough too. I don't do this for me, I'm just suggesting an approach which might bring something to Antoon's objective. Removing all earlier lines of code is often guaranteed to generate errors as variables you are using are not declared or initiated, modules are not imported and so on. Antoon's interested in syntax errors. Removing just the line or three where the previous error happened would also have a good chance of invalidating something. Doubtless. He accepts that any such resume-the-parse can bring misleading error messages. Antoon is not expecting magic, just getting several complaints instead of just the first syntax error. I must admit I sympathise a bit, as one of my own major irks is command line tools which moan about the first bad option instead of noting it and moving on to complain about other things as well, then quitting after the command line parse. Pure laziness a lot of the time IMO; I've done it myself, but do like to make multiple complaints when it's feasible. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
RE: What to use for finding as many syntax errors as possible.
ackets? The compiler or interpreter often cannot fix it so it often tries to skip forward till it finds something unambiguous that mark the beginning of a new section. That might be something like an unquoted semicolon at the end of a line or a matching close bracket. Depending on such choices, again, varying amounts of the program may be ignored in evaluating what follows. But this is not the same as a human speedreading or daydreaming who misses a bit here and there and just hopes it was not crucial and that what follows probably remains worthy and valid. I have sometimes missed something like a name and then seen pages of pronouns like "she" and eventually give up as no more hints arrive and I have to go back or ask someone lest a big bunch of the text makes no sense to me. Someone is wanting to treat code from a spelling checker perspective and wants all possible mistakes thrown at them at once. As I pointed out, in real life many kinds of context can matter and a really good checker might even consult a personal list of words it has learned you want ignored, like people's names or some abbreviations like LOL. It may even read marked-up text in say HTML or XML or similar formats that is marked with the language they supposedly contain and calls up a spell-checker appropriate for each region. But if they want a really intelligent program that recovers enough from errors to reliably continue, maybe not easy. They have explained and amended that they understand some of these issues and are willing to get lots of false negatives or red herrings and their real goal is to have a chance to detect and maybe fix a few things per round rather than just one. Not a bad wish. Just not a trivial wish to grant and satisfy. -Original Message- From: Python-list On Behalf Of Cameron Simpson Sent: Sunday, October 9, 2022 6:45 PM To: python-list@python.org Subject: Re: What to use for finding as many syntax errors as possible. 
On 09Oct2022 21:46, Antoon Pardon wrote: >>Is it that onerous to fix one thing and run it again? It was once when >>you handed in punch cards and waited a day or on very busy machines. > >Yes I find it onerous, especially since I have a pipeline with unit >tests and other tools that all have to redo their work each time a bug >is corrected. It is easy to get the syntax right before submitting to such a pipeline. I usually run a linter on my code for serious commits, and I've got a `lint1` alias which basically runs the short fast flavour of that which does a syntax check and the very fast less thorough lint phase. I say this just to ease your write/run-tests cycle. Regarding your main request, had you considered writing your own wrapper tool? Something which ran something like: python -We:invalid -m py_compile your_python_file.py If there's an error, report it, then make a new file commencing with the next unindented line after the error, with all preceding lines commented out (to keep the line numbers the same). Then run the check again. Repeat until the file's empty or there are no errors. This doesn't sound very complex. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 10/9/2022 1:29 PM, Peter J. Holzer wrote: > On 2022-10-09 12:59:09 -0400, Thomas Passin wrote: >> https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it >> >> People seemed especially enthusiastic about the one-liner from jmd_dk. > > I don't think that one-liner solves Antoon's requirement of continuing > after an error. It uses just the normal python parser so it has exactly > the same limitations. Yes, of course. Interesting, though. py_compile tends to be what I use for a quick check. I linked to the page mostly for the other possibilities, as you mentioned below: > Some of the mentioned tools may do what Antoon wants, though. > > hp > > -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 09Oct2022 21:46, Antoon Pardon wrote: >> Is it that onerous to fix one thing and run it again? It was once when you handed in punch cards and waited a day or on very busy machines. > Yes I find it onerous, especially since I have a pipeline with unit tests and other tools that all have to redo their work each time a bug is corrected. It is easy to get the syntax right before submitting to such a pipeline. I usually run a linter on my code for serious commits, and I've got a `lint1` alias which basically runs the short fast flavour of that which does a syntax check and the very fast less thorough lint phase. I say this just to ease your write/run-tests cycle. Regarding your main request, had you considered writing your own wrapper tool? Something which ran something like: python -We:invalid -m py_compile your_python_file.py If there's an error, report it, then make a new file commencing with the next unindented line after the error, with all preceding lines commented out (to keep the line numbers the same). Then run the check again. Repeat until the file's empty or there are no errors. This doesn't sound very complex. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
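The wrapper loop described above might look roughly like the following. It is a sketch under assumptions, not a tested tool: it calls the built-in `compile()` in-process instead of running `py_compile` as a subprocess, works on a string rather than writing new files, and `find_syntax_errors` is a name invented for the example.

```python
def find_syntax_errors(source, filename="<source>"):
    """Report (line, message) for each syntax error found by repeatedly
    compiling, then commenting out everything up to the next unindented
    line after the error (so line numbers stay the same)."""
    lines = source.splitlines()
    errors = []
    start = 0  # index of the first line that is not yet commented out
    while start < len(lines):
        try:
            compile("\n".join(lines) + "\n", filename, "exec")
            break  # the remainder parses cleanly
        except SyntaxError as exc:
            # Clamp the reported line into the part still under test.
            lineno = min(max(exc.lineno or 1, start + 1), len(lines))
            errors.append((lineno, exc.msg))
            # Find the next non-blank, unindented line after the error.
            resume = lineno
            while resume < len(lines) and (
                not lines[resume].strip() or lines[resume][0] in " \t"
            ):
                resume += 1
            for i in range(start, resume):
                lines[i] = "#" + lines[i]
            start = resume
    return errors


sample = (
    "def f(x)\n"
    "    return x\n"
    "\n"
    "y = 1 +\n"
    "\n"
    "def g():\n"
    "    return 2\n"
    "\n"
    "z = = 3\n"
)
report = find_syntax_errors(sample)
```

On `sample` this reports errors on lines 1, 4 and 9. As the thread anticipates, false positives are easy to provoke: if the resume line is an `else:` or `except:` whose opening statement was just commented out, it will be reported as an error of its own.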
Re: What to use for finding as many syntax errors as possible.
On Mon, 10 Oct 2022 at 06:50, Antoon Pardon wrote: > I just want a parser that doesn't give up on encoutering the first syntax > error. Maybe do some semantic checking like checking the number of parameters. That doesn't make sense though. It's one thing to keep going after finding a non-syntactic error, but an error of syntax *by definition* makes parsing the rest of the file dubious. What would it even *mean* to not give up? How should it interpret the following lines of code? All it can do is report the error. You know, if you'd not made this thread, the time you saved would have been enough for quite a few iterations of "fix one syntactic error, run it again to find the next". ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
Am Sun, Oct 09, 2022 at 07:51:12PM +0200 schrieb Antoon Pardon: > >But the point is: you can't (there is no way to) be sure the > >9+ errors really are errors. > > > >Unless you further constrict what sorts of errors you are > >looking for and what margin of error or leeway for false > >positives you want to allow. > > Look when I was at the university we had to program in Pascal and > the compilor we used continued parsing until the end. Sure there > were times that after a number of reported errors the number of > false positives became so high it was useless trying to find the > remaining true ones, but it still was more efficient to correct the > obvious ones, than to only correct the first one. > > I don't need to be sure. Even the occasional wrong correction > is probably still more efficient than quiting after the first > syntax error. A-ha, so you further defined your context. Under which I can agree to the objective :-) Best, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
> On 9 Oct 2022, at 18:54, Antoon Pardon wrote: > > > > Op 9/10/2022 om 19:23 schreef Karsten Hilbert: >> Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon: >> >>> Op 9/10/2022 om 17:49 schreef Avi Gross: My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. >>> At this moment I would prefer a tool that reported 100 errors, which would >>> allow me to easily correct 10 real errors, over the python strategy which >>> quits >>> after having found one syntax error. >> But the point is: you can't (there is no way to) be sure the >> 9+ errors really are errors. >> >> Unless you further constrict what sorts of errors you are >> looking for and what margin of error or leeway for false >> positives you want to allow. > > Look when I was at the university we had to program in Pascal and > the compilor we used continued parsing until the end. Sure there > were times that after a number of reported errors the number of > false positives became so high it was useless trying to find the > remaining true ones, but it still was more efficient to correct the > obvious ones, than to only correct the first one. If it’s very fast to syntax check then one at a time is fine. Python is very fast to syntax check so I personal do not need the multi error version. My editor has syntax check on a key and it’s instant to drop me a syntax error. Barry > > I don't need to be sure. Even the occasional wrong correction > is probably still more efficient than quiting after the first > syntax error. > > -- > Antoon. > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
Op 9/10/2022 om 21:44 schreef Avi Gross: But an error like setting the size of a fixed length data structure to the right size may result in oodles of errors about being out of range that magically get fixed by one change. Sometimes too much info just gives you a headache. So? The user of such a tool doesn't need to go through all the provided information. If after correcting a few errors, the users find the rest of the information gives him a headache, he can just ignore all that and just run a new iteration. -- Antoon Pardon -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 2022-10-09 15:18:19 -0400, Avi Gross wrote: > Antoon, it may also relate to an interpreter versus compiler issue. > > Something like a compiler for C does not do anything except write code in > an assembly language. It can choose to keep going after an error and start > looking some more from a less stable place. > > Interpreters for Python have to catch interrupts as they go and often run > code in small batches. Continuing to evaluate after an error could cause > weird effects. I don't think this is really an issue. A python file is completely compiled to byte code before execution starts. It's true that a syntax error before an import prevents that import, but since imports are usually at the start of a file, a syntax error will only rarely prevent the import (and files intended to be imported generally don't have weird side effects anyway). One issue could be that compilers which generate executables are generally thorough and slow, while the compilers which generate byte-code for immediate consumption by an interpreter are generally simple and fast. So there is more incentive for the former to discover as many errors as possible and they are also better equipped to do this. hp -- Peter J. Holzer | h...@hjp.at | http://www.hjp.at/ | "Story must make more sense than reality." -- Charles Stross, "Creative writing challenge!" -- https://mail.python.org/mailman/listinfo/python-list
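The claim that the whole file is compiled before any of it runs is easy to check in miniature. The sketch below separates the two steps explicitly with `compile()` and `exec()`, which mirrors, in simplified form, what happens when the interpreter loads a file; `run_source` is a hypothetical helper.

```python
import contextlib
import io


def run_source(source):
    """Compile, then execute. Return (captured stdout, error line or None)."""
    captured = io.StringIO()
    error_line = None
    try:
        code = compile(source, "<demo>", "exec")  # the whole text is parsed here
        with contextlib.redirect_stdout(captured):
            exec(code, {})
    except SyntaxError as exc:
        error_line = exc.lineno
    return captured.getvalue(), error_line


# The print() on line 1 never runs, because line 2 fails to parse.
output, error_line = run_source('print("side effect")\nx = = 1\n')
```

So a later syntax error cannot leave a half-executed script behind; nothing on the earlier lines has had a chance to run.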
Re: What to use for finding as many syntax errors as possible.
On 9/10/2022 at 21:18, Avi Gross wrote: > Antoon, it may also relate to an interpreter versus compiler issue. Something like a compiler for C does not do anything except write code in an assembly language. It can choose to keep going after an error and start looking some more from a less stable place. Interpreters for Python have to catch interrupts as they go and often run code in small batches. Continuing to evaluate after an error could cause weird effects. So what you want is closer to a lint program that does not run code at all, or merely writes pseudocode to a file to be run faster later. I just want a parser that doesn't give up on encountering the first syntax error. Maybe do some semantic checking like checking the number of parameters. > I will say that often enough a program could report more possible errors. Putting your code into multiple files and modules may mean you could cleanly evaluate the code and return multiple errors from many modules as long as they are distinct. Finding all errors is not possible if recovery from one is not guaranteed. I don't need it to find all errors, as long as it reasonably accurately finds a significant number of them. > Is it that onerous to fix one thing and run it again? It was once when you handed in punch cards and waited a day or on very busy machines. Yes I find it onerous, especially since I have a pipeline with unit tests and other tools that all have to redo their work each time a bug is corrected. -- Antoon. -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
I will say that those of us meaning me, who express reservations are not arguing it is a bad idea to get more info in one sweep. Many errors come in bunches. If I keep calling some function with the wrong number or type of arguments, it may be the same in a dozen places in my code. The first error report may make me search for the others places so I fix it all at once. Telling me where some instances are might speed that a bit. As long as it is understood that further errors are a heuristic and possibly misleading, fine. But an error like setting the size of a fixed length data structure to the right size may result in oodles of errors about being out of range that magically get fixed by one change. Sometimes too much info just gives you a headache. But a tool like you described could have uses even if imperfect. If you are teaching a course and students submit programs, could you grade the one with a single error higher than one with 5 errors shown imperfectly and fail the one with 600? On Sun, Oct 9, 2022, 1:53 PM Antoon Pardon wrote: > > > Op 9/10/2022 om 19:23 schreef Karsten Hilbert: > > Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon: > > > >> Op 9/10/2022 om 17:49 schreef Avi Gross: > >>> My guess is that finding 100 errors might turn out to be misleading. > If you > >>> fix just the first, many others would go away. > >> At this moment I would prefer a tool that reported 100 errors, which > would > >> allow me to easily correct 10 real errors, over the python strategy > which quits > >> after having found one syntax error. > > But the point is: you can't (there is no way to) be sure the > > 9+ errors really are errors. > > > > Unless you further constrict what sorts of errors you are > > looking for and what margin of error or leeway for false > > positives you want to allow. > > Look when I was at the university we had to program in Pascal and > the compilor we used continued parsing until the end. 
Sure there > were times that after a number of reported errors the number of > false positives became so high it was useless trying to find the > remaining true ones, but it still was more efficient to correct the > obvious ones, than to only correct the first one. > > I don't need to be sure. Even the occasional wrong correction > is probably still more efficient than quiting after the first > syntax error. > > -- > Antoon. > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
Antoon, it may also relate to an interpreter versus compiler issue. Something like a compiler for C does not do anything except write code in an assembly language. It can choose to keep going after an error and start looking some more from a less stable place. Interpreters for Python have to catch interrupts as they go and often run code in small batches. Continuing to evaluate after an error could cause weird effects. So what you want is closer to a lint program that does not run code at all, or merely writes pseudocode to a file to be run faster later. Many languages now have blocks of code that are not really be evaluated till later. Some code is built on the fly. And some errors are not errors at first. Many languages let you not declare a variable before using it or allow it to change types. In some, the text is lazily evaluated as late as possible. I will say that often enough a program could report more possible errors. Putting your code into multiple files and modules may mean you could cleanly evaluate the code and return multiple errors from many modules as long as they are distinct. Finding all errors is not possible if recovery from one is not guaranteed. Take a language that uses a semicolon to end a statement. If absent usually there would be some error but often something on the next line. Your evaluator could do an experiment and add a semicolon and try again. This might work 90% of the time but sometimes the error was not ending the line with a backslash to make it continue properly, or an indentation issue and even spelling error. No guarantees. Is it that onerous to fix one thing and run it again? It was once when you handed in punch cards and waited a day or on very busy machines. On Sun, Oct 9, 2022, 1:03 PM Antoon Pardon wrote: > > > Op 9/10/2022 om 17:49 schreef Avi Gross: > > My guess is that finding 100 errors might turn out to be misleading. If > you > > fix just the first, many others would go away. 
> > At this moment I would prefer a tool that reported 100 errors, which would > allow me to easily correct 10 real errors, over the python strategy which > quits > after having found one syntax error. > > -- > Antoon. > > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 2022-10-09 18:51, Antoon Pardon wrote: Op 9/10/2022 om 19:23 schreef Karsten Hilbert: Am Sun, Oct 09, 2022 at 06:59:36PM +0200 schrieb Antoon Pardon: Op 9/10/2022 om 17:49 schreef Avi Gross: My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error. But the point is: you can't (there is no way to) be sure the 9+ errors really are errors. Unless you further constrict what sorts of errors you are looking for and what margin of error or leeway for false positives you want to allow. Look when I was at the university we had to program in Pascal and the compilor we used continued parsing until the end. Sure there were times that after a number of reported errors the number of false positives became so high it was useless trying to find the remaining true ones, but it still was more efficient to correct the obvious ones, than to only correct the first one. I don't need to be sure. Even the occasional wrong correction is probably still more efficient than quiting after the first syntax error. When I did some programming in COBOL, a single omitted "." would completely confuse the compiler and it was best to fix that one error and then try again. On the other hand, TurboPascal would also stop on the first error and put the cursor at the error position in the IDE, but as it compiled quickly, it wasn't a problem. It was no slower than it would've been if it had found multiple errors and you pressed a key to advance to the next error. -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
PyCharm. It does a good job of separating "these are really errors" from "do you really mean that?" warnings from "is this word spelled right?" hints. https://www.jetbrains.com/pycharm/ From: Python-list on behalf of Antoon Pardon Date: Sunday, October 9, 2022 at 6:11 AM To: python-list@python.org Subject: What to use for finding as many syntax errors as possible. I would like a tool that tries to find as many syntax errors as possible in a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations? -- Antoon Pardon -- https://mail.python.org/mailman/listinfo/python-list
Re: What to use for finding as many syntax errors as possible.
On 9/10/2022 at 19:23, Karsten Hilbert wrote:
> On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
> > On 9/10/2022 at 17:49, Avi Gross wrote:
> > > My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.
> >
> > At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.
>
> But the point is: you can't (there is no way to) be sure the 9+ errors really are errors. Unless you further restrict what sorts of errors you are looking for and what margin of error or leeway for false positives you want to allow.

Look, when I was at university we had to program in Pascal, and the compiler we used continued parsing until the end. Sure, there were times when, after a number of reported errors, the number of false positives became so high that it was useless trying to find the remaining true ones, but it was still more efficient to correct the obvious ones than to correct only the first one.

I don't need to be sure. Even the occasional wrong correction is probably still more efficient than quitting after the first syntax error.

-- Antoon.
Re: What to use for finding as many syntax errors as possible.
On 2022-10-09 19:23:41 +0200, Karsten Hilbert wrote:
> On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
> > On 9/10/2022 at 17:49, Avi Gross wrote:
> > > My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.
> >
> > At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.
>
> But the point is: you can't (there is no way to) be sure the 9+ errors really are errors.

As a human who knows Python, in many cases you can be sure. Sometimes you aren't sure; then you leave that one for the next iteration. No big deal. This isn't the 1960s, when you sent your punched cards in and got the result back next week.

So neither the parser nor you need to be perfect. Just better than one error at a time.

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
Re: What to use for finding as many syntax errors as possible.
On 2022-10-09 12:59:09 -0400, Thomas Passin wrote:
> https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it
>
> People seemed especially enthusiastic about the one-liner from jmd_dk.

I don't think that one-liner solves Antoon's requirement of continuing after an error. It uses just the normal python parser, so it has exactly the same limitations.

Some of the mentioned tools may do what Antoon wants, though.

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
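The limitation described above (anything built on the normal Python parser stops at the first syntax error) can be seen with a minimal sketch. It uses only the standard library's ast.parse; the SOURCE text and the helper name first_syntax_error are made up for illustration and are not taken from the Stack Overflow answer.

```python
import ast

# Illustrative input with two independent syntax errors:
# line 1 lacks the colon after the parameter list, line 4 has a stray '='.
SOURCE = """\
def f(x)
    return x

value = = 2
"""


def first_syntax_error(source, filename="<example>"):
    """Return the SyntaxError raised by the standard parser, or None if the source parses."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return exc
    return None


err = first_syntax_error(SOURCE)
# Only one exception is raised; the parser never gets as far as line 4.
print(f"line {err.lineno}: {err.msg}")
```

The exact wording of the message depends on the Python version, but only the error on line 1 is reported; the second error shows up only after the first has been fixed and the file is parsed again.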
Re: What to use for finding as many syntax errors as possible.
On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
> On 9/10/2022 at 17:49, Avi Gross wrote:
> > My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.
>
> At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.

But the point is: you can't (there is no way to) be sure the 9+ errors really are errors.

Unless you further restrict what sorts of errors you are looking for and what margin of error or leeway for false positives you want to allow.

Karsten

--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: What to use for finding as many syntax errors as possible.
https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it

People seemed especially enthusiastic about the one-liner from jmd_dk.

On 10/9/2022 12:17 PM, Peter J. Holzer wrote:
> On 2022-10-09 12:09:17 +0200, Antoon Pardon wrote:
> > I would like a tool that tries to find as many syntax errors as possible in a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds, but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations?
>
> There seems to have been increased interest in good error recovery over the last few years. I thought I had bookmarked a bunch of projects, but the only one I can find right now is Lezer (https://marijnhaverbeke.nl/blog/lezer.html), which is part of the CodeMirror (https://codemirror.net/) editor. Python is listed as a currently supported language, so you might want to check that out.
>
> Disclaimer: I haven't used CodeMirror, so I can't say anything about its quality. The blog entry about Lezer was interesting, though.
>
> hp
Re: What to use for finding as many syntax errors as possible.
On 9/10/2022 at 17:49, Avi Gross wrote:
> My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away.

At this moment I would prefer a tool that reported 100 errors, which would allow me to easily correct 10 real errors, over the python strategy which quits after having found one syntax error.

-- Antoon.
Re: What to use for finding as many syntax errors as possible.
On 2022-10-09 12:09:17 +0200, Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible in a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds, but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations?

There seems to have been increased interest in good error recovery over the last few years. I thought I had bookmarked a bunch of projects, but the only one I can find right now is Lezer (https://marijnhaverbeke.nl/blog/lezer.html), which is part of the CodeMirror (https://codemirror.net/) editor. Python is listed as a currently supported language, so you might want to check that out.

Disclaimer: I haven't used CodeMirror, so I can't say anything about its quality. The blog entry about Lezer was interesting, though.

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
Re: What to use for finding as many syntax errors as possible.
Antoon,

There likely are such programs out there, but is there universal agreement on how to figure out where a new safe zone of code starts, so that error checking can resume? For example, a checker working through a file full of function definitions might find an error in function 1, try to find the end of that function, and resume checking at the next function. But what if a function defines local functions within it? What if the mistake in one line of code could still allow checking the next line rather than skipping it all?

My guess is that finding 100 errors might turn out to be misleading. If you fix just the first, many others would go away. If you spell a variable name wrong when declaring it, a dozen uses of the right name may cause errors. Should you fix the first or change all later ones?

On Sun, Oct 9, 2022, 6:11 AM Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible in a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds, but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations?
>
> -- Antoon Pardon
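The "find the end of that function and resume at the next one" idea can be sketched roughly as follows. This is a deliberately naive illustration, not how any existing tool works: the helper names split_top_level_blocks and find_syntax_errors are hypothetical, and the rule that a new "safe zone" starts at any line beginning in column 0 is an assumption. It sidesteps the nested-function question by only ever resynchronising at top level, and it will produce false positives for constructs such as decorators or multi-line strings whose content touches column 0.

```python
import ast
import re

# Column-0 lines that continue the previous block instead of starting a new one
# (clause keywords, closing brackets, comments). Hypothetical heuristic.
CONTINUATION = re.compile(r"(else|elif|except|finally)\b|[)\]}#]")


def split_top_level_blocks(source):
    """Split source into (first_line_number, text) chunks at column-0 statements."""
    blocks = []
    for number, line in enumerate(source.splitlines(keepends=True), start=1):
        starts_new = (
            line.strip() != ""
            and not line[0].isspace()
            and not CONTINUATION.match(line)
        )
        if starts_new or not blocks:
            blocks.append([number, line])
        else:
            blocks[-1][1] += line
    return [(number, text) for number, text in blocks]


def find_syntax_errors(source):
    """Parse every chunk on its own; report at most one error per chunk."""
    errors = []
    for first_line, text in split_top_level_blocks(source):
        try:
            ast.parse(text)
        except SyntaxError as exc:
            # exc.lineno counts from the start of the chunk, so shift it
            # back to a line number in the whole file.
            errors.append((first_line + (exc.lineno or 1) - 1, exc.msg))
    return errors


SOURCE = """\
def first(x)
    return x

def second(y):
    return y +

def third(z):
    return z
"""

for line, message in find_syntax_errors(SOURCE):
    print(f"line {line}: {message}")
```

Here the broken first and second functions are each reported once and the third parses cleanly, because each chunk is handed to the standard parser separately. Errors inside one chunk still hide each other, which is the trade-off the question above is pointing at.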
What to use for finding as many syntax errors as possible.
I would like a tool that tries to find as many syntax errors as possible in a python file. I know there is the risk of false positives when a tool tries to recover from a syntax error and proceeds, but I would prefer that over the current python strategy of quitting after the first syntax error. I just want a tool for syntax errors. No style enforcements. Any recommendations?

-- Antoon Pardon