Cameron, or OP if you prefer, I think by now you have seen the suggestion that languages make choices: highly structured ones can be easier to "recover" from errors, and to try to continue, than languages with far more complex possibilities that look rather unstructured.
What is the error in code like this?

    A, b, c, d = 1, 2,

Or is it an error at all? Many languages have no concept of doing anything like the above, some tolerate a trailing comma, some set anything not matched to some form of NULL or uninitialized, and some ...

If you look at human languages, some are fairly simple and some are very heavily organized. But in a way it can make sense. Languages with grammatical gender will often ask you to change the spelling, and often the pronunciation, not only based on whether a noun is masculine/feminine or even neuter, but will also insist you change the form of verbs, adjectives, and so on, which in effect gives multiple signals that all have to line up to make a valid and understandable sentence. Heck, in conversation people can often leave out parts of a sentence, such as whether you are talking about "I" or "you" or "she" or "we", because the rest of the words in the sentence redundantly force only one choice to be possible.

So some such annoying grammars (in my opinion) are error detection/correction codes in disguise. In the days before microphones and speakers, it was common not to hear people well, say on a stage a hundred feet away with other ambient noise. Missing a word or two might still let you get the point, because other parts of the sentence carried that redundancy. Many languages have similar strictures, telling you multiple times whether something is singular or plural.

And I think another reason was what I call stranger detection. People who learn some vocabulary might still not speak correctly and be identifiable as strangers, as in spies. Do we need this in the modern age? Who knows! But it makes me prefer some languages over others, albeit other reasons may ...

With the internet today, we are used to expecting error correction to come for free.
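For what it's worth, here is how CPython actually treats that line: the mismatched unpacking is perfectly legal syntax but fails at run time, while the trailing comma by itself is harmless. A small sketch of my own, not from the original post:

```python
# Mismatched unpacking parses fine but raises ValueError at run time.
try:
    a, b, c, d = 1, 2,   # four targets, two values; trailing comma is legal
except ValueError as exc:
    print(exc)           # not enough values to unpack (expected 4, got 2)

# The trailing comma alone is fine; it just ends a tuple literal.
x, y = 1, 2,
print(x, y)              # 1 2
```

So Python's answer is "not a syntax error at all": the parser accepts it, and only the runtime complains about the arity mismatch.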
Do you really need one of every 8 bits to be a parity bit, which only catches maybe half of the errors, when the internals of your computer are relatively error free and even the outside is protected by things like the various protocols used in making and examining packets, demanding some be sent again if a checksum does not match? Tons of checking is built in, so at your level you rarely think about it. If you get a message, it usually is either 99.9999% accurate, or it is not shown to you at all. I am not talking about spam but about errors of transmission.

So my analogy is that if you want a very highly structured language that can recover somewhat from errors, Python may not be it. And over the years, as features are added or modified, the structure tends to get more complex. And R is not alone. Many surviving languages continue to evolve and borrow from each other, and a program that you run today, which could partially recover and produce pages of possible errors, may blow up when new features are introduced. And with Unicode, the number of possible "errors" in what is placed in code, for languages like Julia that allow them in most places ...

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On Behalf Of Cameron Simpson
Sent: Monday, October 10, 2022 6:17 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On 11Oct2022 08:02, Chris Angelico <ros...@gmail.com> wrote:
>There's a huge difference between non-fatal errors and syntactic
>errors. The OP wants the parser to magically skip over a fundamental
>syntactic error and still parse everything else correctly. That's never
>going to work perfectly, and the OP is surprised at this.

The OP is not surprised by this, and explicitly expressed awareness that resuming a parse had potential for "misparsing" further code.
I remain of the opinion that one could resume a parse at the next unindented line and get reasonable results a lot of the time. In fact, I expect that one could resume tokenising at almost any line which didn't seem to be inside a string and often get reasonable results.

I grew up with C and Pascal compilers which would _happily_ produce many complaints, usually accurate, about all manner of syntax errors. They didn't stop at the first syntax error.

All you need in principle is a parser which goes "report syntax error here, continue assuming <some state>". For Python that might mean "pretend there's a missing final colon" or "close the open brackets" etc, depending on the context. If you make conservative implied corrections you can get a reasonable continued parse, enough to find further syntax errors.

I remember the Pascal compiler in particular had a really good "you missed a semicolon _back there_" mode which was almost always correct, a nice boon when correcting mistakes.

Cheers,
Cameron Simpson <c...@cskk.id.au>
--
https://mail.python.org/mailman/listinfo/python-list
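[For the curious, the "resume at the next unindented line" idea can be sketched in a few lines of Python. This is my own toy illustration of the heuristic, not anything from CPython or from Cameron; the chunking is deliberately naive and will misattribute errors in some inputs, exactly the "misparsing" risk discussed above.]

```python
import ast

def all_syntax_errors(source):
    """Collect multiple SyntaxErrors from one source string.

    Toy heuristic: on a syntax error, record it, skip forward to the
    next line that starts in column 0, and try to parse the rest again.
    """
    lines = source.splitlines()
    errors = []
    start = 0
    while start < len(lines):
        try:
            ast.parse("\n".join(lines[start:]))
            break                        # the remainder parses cleanly
        except SyntaxError as exc:
            err_line = start + (exc.lineno or 1)   # absolute 1-based line
            errors.append((err_line, exc.msg))
            resume = err_line            # scan past the offending line
            while resume < len(lines) and (
                not lines[resume] or lines[resume][0] in " \t"
            ):
                resume += 1
            if resume <= start:          # no progress possible; give up
                break
            start = resume
    return errors

src = "def f(:\n    pass\ndef g():\n    return 1 +\n"
for lineno, msg in all_syntax_errors(src):
    print(f"line {lineno}: {msg}")
```

On that sample it reports both the bad `def f(:` header on line 1 and the dangling `return 1 +` on line 4, rather than stopping at the first error the way a single `ast.parse` call would.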