[Python-ideas] Re: Regex timeouts
Chris Angelico writes:
 > On Wed, 16 Feb 2022 at 01:54, Stephen J. Turnbull wrote:

 > > That is, all regexp implementations support the same basic
 > > language which is sufficient for most tasks most programmers want
 > > regexps for.
 >
 > The problem is that that's an illusion.

It isn't for me. I write a lot of regexps for several languages
(Python, grep, sed, Emacs Lisp), I rarely have to debug one, and in a
year there may be one that debugging requires more than reading each
character out loud and recognizing a typo.

As a sometime Emacsen dev, I also do a fair amount of debugging of
other people's regexps. Yuck! But it's almost always the case that
(modulo efficiency considerations) it's pretty easy to figure out what
they *want*, and rewrite the code (*not* the *regexp(s)*!) to use
simpler regexps (usually parts of the original) in a more controlled
way.

 > If you restrict yourself to the subset that's supported by every
 > regexp implementation, you'll quickly find tasks that you can't
 > handle.

That's true of everything in programming, there are no tools that can
handle everything until you have a Turing-complete programming
language, and even then, practically there are always things that are
too painful even for masochists to do in that language.

But with regexps, I don't, you see. Besides regexps, I write a lot of
(trivial to simple) parsers. For the regexps, I don't need much more
than ()[]+*?.\s\d\t\r\n most of the time (and those last 3 are due to
tab-separated value files and RFC 822, special cases). I could
probably use scanf (except that Python, sed, and Emacs Lisp don't have
it ;-) but for the lack of []. Occasionally for things like date
parsing and other absolutely fixed-field contexts I'll use {}. I do
sanity-checking on the result frequently.

If the regexp engine supports it, I'll use named groups and other such
syntactic sugar. In a general purpose programming language, if it
supports "literate regexps", I use those, if not, I use separate
strings (which also makes it easy to back out of a large regexp into
statements if I need to establish backtracking boundaries).

Sure, if you want to do all of that *in* a single regexp, you are
indeed going to run into things you can't do that way. When I wrote,
"what people want to do" I meant tasks where regexps could do a lot of
the task, but not that they could do the whole thing in one regexp.

For my style, regexps are something that's available in a very wide
array of contexts, and consistent enough to get the job done. I treat
complex regexps the way I treat C extensions: only if the performance
benefit is human-perceptible, which is pretty rare.

 > "why are other things not ALSO popular". I honestly think that scanf
 > parsing, if implemented ad-hoc by different programming languages and
 > extended to their needs, would end up

... becoming regexp-like, and not just in the sense of

 > no less different from each other than different regexp engines are
 > - the most-used parts would also be the most-compatible, just like
 > with regexps.

;-)

What I think is more interesting than simpler (but more robust for
what they can do) facilities is better parser support in standard
libraries (not just Python's), and more use of them in place of
hand-written "parsers" that just eat tokens defined by regexps in
order. If one could, for example, write

    [ "Sun|Mon|Tue|Wed|Thu|Fri|Sat" : dow,
      ", ".
      "(?: |\d)\d)" : day,
      " ",
      "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec" : month,
      " ",
      "\d\d\d\d" : year,
      " ",
      "\d\d:\d\d:\d\d" : time,
      " ",
      "[+-]\d\d\d\d" : tzoffset ]

(which is not legal Python syntax but I'm too lazy to try to come up
with something better) to parse an RFC 822 date, I think people would
use that. Sure, for something *that* regular, most people would
probably use the evident "literate" regexp with named groups, but it
wouldn't take much complexity to make such a parser generator
worthwhile to programmers.

Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ZAZBT2FL7R5ADNXJPFFMJ5MPEZQEFPBK/
Code of Conduct: http://python.org/psf/codeofconduct/
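For comparison, the "evident literate regexp with named groups" mentioned above might look roughly like the following in Python. This is only a sketch: the names RFC822_DATE and parse_date are made up for illustration, and the field layout simply follows the example in the message (four-digit year, mandatory day of week) rather than the full RFC 822 grammar.

```python
import re

# A "literate" pattern: under re.VERBOSE, whitespace and "#" comments in
# the pattern are ignored, so literal spaces are written as "[ ]".
# RFC822_DATE and parse_date are hypothetical names for this sketch.
RFC822_DATE = re.compile(
    r"""
    (?P<dow>Sun|Mon|Tue|Wed|Thu|Fri|Sat) ,[ ]    # day of week, then ", "
    (?P<day>[ \d]\d) [ ]                         # space-padded or 2-digit day
    (?P<month>Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [ ]
    (?P<year>\d\d\d\d) [ ]
    (?P<time>\d\d:\d\d:\d\d) [ ]
    (?P<tzoffset>[+-]\d\d\d\d)
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return the named fields as a dict, or None if text does not match."""
    m = RFC822_DATE.fullmatch(text)
    return m.groupdict() if m else None


fields = parse_date("Wed, 16 Feb 2022 21:01:00 +0900")
print(fields["dow"], fields["day"], fields["month"], fields["year"])
```

The fields still come back as strings, so the sanity-checking of the result described in the message (for example, that the day is valid for the month) remains a separate step.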
[Python-ideas] Re: Regex timeouts
On Wed, 16 Feb 2022 at 21:01, Stephen J. Turnbull wrote:
>
> Chris Angelico writes:
> > On Wed, 16 Feb 2022 at 01:54, Stephen J. Turnbull wrote:
> >
> > > That is, all regexp implementations support the same basic
> > > language which is sufficient for most tasks most programmers want
> > > regexps for.
> >
> > The problem is that that's an illusion.
>
> It isn't for me. I write a lot of regexps for several languages
> (Python, grep, sed, Emacs Lisp), I rarely have to debug one, and in a
> year there may be one that debugging requires more than reading each
> character out loud and recognizing a typo.

I've used simple regexps in sed and grep, and found differences about
what needs to be escaped, so even when you don't use the advanced
features, you need to be aware of them.

> But with regexps, I don't, you see. Besides regexps, I write a lot of
> (trivial to simple) parsers. For the regexps, I don't need much more
> than ()[]+*?.\s\d\t\r\n most of the time (and those last 3 are due to
> tab-separated value files and RFC 822, special cases). I could
> probably use scanf (except that Python, sed, and Emacs Lisp don't have
> it ;-) but for the lack of []. Occasionally for things like date
> parsing and other absolutely fixed-field contexts I'll use {}. I do
> sanity-checking on the result frequently.

Not sure what you mean by "lack of []", but some scanf variants do
support that - for instance, %[a-z] will only match lowercase alpha.

> If the regexp engine supports it, I'll use named groups and other such
> syntactic sugar. In a general purpose programming language, if it
> supports "literate regexps", I use those, if not, I use separate
> strings (which also makes it easy to back out of a large regexp into
> statements if I need to establish backtracking boundaries).

That's what I mean about the illusion. You can't use named groups in
all regexp engines.

> > "why are other things not ALSO popular". I honestly think that scanf
> > parsing, if implemented ad-hoc by different programming languages and
> > extended to their needs, would end up
>
> ... becoming regexp-like, and not just in the sense of
>
> > no less different from each other than different regexp engines are
> > - the most-used parts would also be the most-compatible, just like
> > with regexps.
>
> ;-)

Heh, probably true :)

> What I think is more interesting than simpler (but more robust for
> what they can do) facilities is better parser support in standard
> libraries (not just Python's), and more use of them in place of
> hand-written "parsers" that just eat tokens defined by regexps in
> order. If one could, for example, write
>
>     [ "Sun|Mon|Tue|Wed|Thu|Fri|Sat" : dow,
>       ", ".
>       "(?: |\d)\d)" : day,
>       " ",
>       "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec" : month,
>       " ",
>       "\d\d\d\d" : year,
>       " ",
>       "\d\d:\d\d:\d\d" : time,
>       " ",
>       "[+-]\d\d\d\d" : tzoffset ]
>
> (which is not legal Python syntax but I'm too lazy to try to come up
> with something better) to parse an RFC 822 date, I think people would
> use that. Sure, for something *that* regular, most people would
> probably use the evident "literate" regexp with named groups, but it
> wouldn't take much complexity to make such a parser generator
> worthwhile to programmers.
>

That's an interesting concept. I can imagine writing it declaratively
like this:

    class Date(parser):
        dow: "Sun|Mon|Tue|Wed|Thu|Fri|Sat"
        _: ", "
        day: "(?: |\d)\d)"
        _: " "
        month: "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
        _: " "
        year: "\d\d\d\d"
        _: " "
        time: "\d\d:\d\d:\d\d"
        _: " "
        tzoffset: "[+-]\d\d\d\d"

Would it be better than a plain regex? Not sure.

ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6DJ35KUEGN4Y4GD6YPUNWVE2ZDR6QFAY/
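The declarative form above can be approximated in today's Python by reading the class annotations. The following is a rough sketch under several assumptions, not an existing library: Parser, pattern and parse are hypothetical names; the separators are called _1, _2, ... because a repeated "_" annotation would collapse into a single dictionary entry; and the day pattern is written with balanced parentheses.

```python
import re


class Parser:
    """Hypothetical base class: joins the subclass's annotations into one regex."""

    @classmethod
    def pattern(cls):
        parts = []
        for name, regex in cls.__annotations__.items():
            if name.startswith("_"):
                # Underscore-prefixed fields are separators: match, don't capture.
                parts.append(f"(?:{regex})")
            else:
                parts.append(f"(?P<{name}>{regex})")
        return re.compile("".join(parts))

    @classmethod
    def parse(cls, text):
        """Return a dict of the named fields, or None if text does not match."""
        m = cls.pattern().fullmatch(text)
        return m.groupdict() if m else None


class Date(Parser):
    dow: "Sun|Mon|Tue|Wed|Thu|Fri|Sat"
    _1: ", "
    day: r"(?: |\d)\d"
    _2: " "
    month: "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
    _3: " "
    year: r"\d\d\d\d"
    _4: " "
    time: r"\d\d:\d\d:\d\d"
    _5: " "
    tzoffset: r"[+-]\d\d\d\d"


print(Date.parse("Wed, 16 Feb 2022 21:01:00 +0900"))
```

Since this only concatenates the pieces into one regular expression, it is sugar over a plain regex with named groups rather than a real parser generator, which is one way to read the "Not sure" above.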
[Python-ideas] Re: Regex timeouts
On Wed, 16 Feb 2022 at 10:23, Chris Angelico wrote:
>
> On Wed, 16 Feb 2022 at 21:01, Stephen J. Turnbull wrote:
> > What I think is more interesting than simpler (but more robust for
> > what they can do) facilities is better parser support in standard
> > libraries (not just Python's), and more use of them in place of
> > hand-written "parsers" that just eat tokens defined by regexps in
> > order. If one could, for example, write
> >
> >     [ "Sun|Mon|Tue|Wed|Thu|Fri|Sat" : dow,
> >       ", ".
> >       "(?: |\d)\d)" : day,
> >       " ",
> >       "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec" : month,
> >       " ",
> >       "\d\d\d\d" : year,
> >       " ",
> >       "\d\d:\d\d:\d\d" : time,
> >       " ",
> >       "[+-]\d\d\d\d" : tzoffset ]
> >
> > (which is not legal Python syntax but I'm too lazy to try to come up
> > with something better) to parse an RFC 822 date, I think people would
> > use that. Sure, for something *that* regular, most people would
> > probably use the evident "literate" regexp with named groups, but it
> > wouldn't take much complexity to make such a parser generator
> > worthwhile to programmers.
> >
>
> That's an interesting concept. I can imagine writing it declaratively
> like this:
>
>     class Date(parser):
>         dow: "Sun|Mon|Tue|Wed|Thu|Fri|Sat"
>         _: ", "
>         day: "(?: |\d)\d)"

I find it mildly amusing that even this "better" solution fell victim
to an incorrect regexp ;-)

However, I do like the idea of having a better parser library in the
stdlib. But it's pretty easy to write such a thing and publish it on
PyPI, so the lack of an obvious "best in class" answer for this
problem suggests that people would be less likely to use such a
feature than we're assuming.

The two obvious examples on PyPI are:

1. PyParsing - https://pypi.org/project/pyparsing/. To me, this has
   the feel of the sort of functional approach SNOBOL used.
2. parse - https://pypi.org/project/parse/. A scanf-style approach
   inspired by format rather than printf.

Do people choose regexes over these because re is in the stdlib? Are
they simply less well known? Or is there an attraction to regexes that
makes people prefer them in spite of the complexity/maintainability
issues?

Paul

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MAVVUR2L3GKXJLAXZUULSFQT5PDCZWMI/
[Python-ideas] Re: Regex timeouts
On Wed, Feb 16, 2022, 5:46 AM Paul Moore wrote:

> [...]
>
> However, I do like the idea of having a better parser library in the
> stdlib. But it's pretty easy to write such a thing and publish it on
> PyPI, so the lack of an obvious "best in class" answer for this
> problem suggests that people would be less likely to use such a
> feature than we're assuming.
>
> The two obvious examples on PyPI are:
>
> 1. PyParsing - https://pypi.org/project/pyparsing/. To me, this has
>    the feel of the sort of functional approach SNOBOL used.
> 2. parse - https://pypi.org/project/parse/. A scanf-style approach
>    inspired by format rather than printf.
>
> Do people choose regexes over these because re is in the stdlib? Are
> they simply less well known? Or is there an attraction to regexes that
> makes people prefer them in spite of the complexity/maintainability
> issues?
>
> Paul

Long story below but TLDR: I tried to use parse for a task I worked on
for a long time, and eventually had to learn regex. After using regex
somewhat regularly for a while now I concluded the power and ubiquity
of it is worth the additional cognitive load (parse only requires one
to be familiar with standard python string format syntax).

Story:

The first task I set about trying to do in python (with no practical
programming experience except for a single semester of c++ as part of
a civil engineering curriculum) was a tool to convert 2D finite
element mesh files into a file format for a niche finite element
analysis program (the program is called CANDE; it's for analysis of
buried culverts and pipes).

My predecessor was creating these meshes by hand. He would literally
get a 24"x36" sheet of drafting paper and draw out his mesh and number
the nodes and elements and enter the data into the text file.

It took me eons to write something (probably 6 years!), I probably
started over from scratch at least 7, maybe 10 times. And even after
all that, while I finally did arrive at something usable for myself, I
never achieved my goal of being able to package something up I can
pass on to my other colleague. When a new mesh has to be created they
just ask me to do it (I still do them occasionally).

Anyway all that to say: I remember trying to avoid learning regex for
about 4 years of this. It looked too scary. One day I finally ran into
a task that parse:

https://pypi.org/project/parse/

...which I was relying heavily on, couldn't handle. I researched it
and am pretty confident that even in my relative ignorance I am/was
correct about it not being able to do it (I am racking my brain but
can't remember what that need was).

This prompted me to FINALLY do a few regex tutorials and watch some
pycon videos and I came out the other end realizing that, hey, regex
isn't so bad. And it's so darn powerful that the trade-off between it
being an uphill task to learn and read (at least at first) and what
you are able to do with it seems worth it.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KC2K6AX2P224SDSI6CB3P7LPWBU7M7L6/
[Python-ideas] Creating ranges with ellipsis
This might be a silly idea but, would it be a good idea to have
...[a:b:c] return a range(a, b, c)?

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/IBXS5ZHI2XTAKUFLOE2UWFBPCX7HUW75/
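As things stand, subscripting the Ellipsis object raises TypeError, but a similar spelling is available without any language change by giving an ordinary object a __getitem__ that accepts a slice. A minimal sketch, in which the class name _RangeMaker and the instance name r are invented for illustration:

```python
class _RangeMaker:
    """Hypothetical helper: r[a:b:c] returns range(a, b, c)."""

    def __getitem__(self, item):
        if not isinstance(item, slice):
            raise TypeError("expected a slice such as r[a:b:c]")
        if item.stop is None:
            # range() has no open-ended form, so a stop is required here.
            raise ValueError("an explicit stop is required")
        start = 0 if item.start is None else item.start
        step = 1 if item.step is None else item.step
        return range(start, item.stop, step)


r = _RangeMaker()

print(r[1:10:2])    # the same object range(1, 10, 2) would build
print(list(r[:5]))
```

Filling in 0 and 1 for a missing start and step mirrors the defaults of range(); it is a choice made for this sketch, not part of the proposal.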
[Python-ideas] Re: Creating a class template file generator inside Python language
Thanks for the previous mail.
I would like to highlight the value my thought offers to the developers.
They often have to write classes while writing a module. In the class, they
are supposed to write classes and objects. In a class, typically the
attributes are private, and we have getters and setters to access and
modify them. My thought is to automate the process of creating such class
files with command line.
For this, I have made my initial attempt to create a module that does so.
Here is a short video in which I am demonstrating it.
I am also attaching the details of my terminal to get a bit of more
details.

I am looking forward to hearing from you if I should work on integrating
it within the Python Language features, or I should go some other way
round, or this is not a very useful tool for many people.

avanishcodes@avanishcodes:~$ echo "No Student.py file here"
No Student.py file here
avanishcodes@avanishcodes:~$ ls
Academics  Documents  node_modules       Pictures  snap        Videos
bin        Downloads  package.json       Projects  Student.py
Desktop    Music      package-lock.json  Public    Templates
avanishcodes@avanishcodes:~$ rm Student.py
avanishcodes@avanishcodes:~$ ls
Academics  Documents  node_modules       Pictures  snap
bin        Downloads  package.json       Projects  Templates
Desktop    Music      package-lock.json  Public    Videos
avanishcodes@avanishcodes:~$ pip install classgen
Requirement already satisfied: classgen in ./.local/lib/python3.8/site-packages (0.0.7)
avanishcodes@avanishcodes:~$ python3 -m classgen Student member1,member2,member3,member4,member5,member6
Class Student generated successfully.
avanishcodes@avanishcodes:~$ cat Student.py
#(class) Student
class Student:
    """
    This class is used to represent a Student.

    Attributes:
        member1: The member1 of the Student.
        member2: The member2 of the Student.
        member3: The member3 of the Student.
        member4: The member4 of the Student.
        member5: The member5 of the Student.
        member6: The member6 of the Student.

    Methods:
        get_member1(self): Gets the member1 of the Student.
        get_member2(self): Gets the member2 of the Student.
        get_member3(self): Gets the member3 of the Student.
        get_member4(self): Gets the member4 of the Student.
        get_member5(self): Gets the member5 of the Student.
        get_member6(self): Gets the member6 of the Student.
        set_member1(self, member1): Sets the member1 of the Student.
        set_member2(self, member2): Sets the member2 of the Student.
        set_member3(self, member3): Sets the member3 of the Student.
        set_member4(self, member4): Sets the member4 of the Student.
        set_member5(self, member5): Sets the member5 of the Student.
        set_member6(self, member6): Sets the member6 of the Student.
    """

    _member1: None
    _member2: None
    _member3: None
    _member4: None
    _member5: None
    _member6: None

    def __init__(self, member1, member2, member3, member4, member5, member6):
        """
        Initializes a Student object.

        Params:
            member1: The member1 of the Student.
            member2: The member2 of the Student.
            member3: The member3 of the Student.
            member4: The member4 of the Student.
            member5: The member5 of the Student.
            member6: The member6 of the Student.
        """
        self._member1 = member1
        self._member2 = member2
        self._member3 = member3
        self._member4 = member4
        self._member5 = member5
        self._member6 = member6

    def get_member1(self):
        """
        Gets the member1 of the Student.

        Returns:
            get_member1: The member1 of the Student.
        """
        return self._member1

    def get_member2(self):
        """
        Gets the member2 of the Student.

        Returns:
            get_member2: The member2 of the Student.
        """
        return self._member2

    def get_member3(self):
        """
        Gets the member3 of the Student.

        Returns:
            get_member3: The member3 of the Student.
        """
        return self._member3

    def get_member4(self):
        """
        Gets the member4 of the Student.

        Returns:
            get_member4: The member4 of the Student.
        """
        return self._member4

    def get_member5(self):
        """
        Gets the member5 of the Student.

        Returns:
            get_member5: The member5 of the Student.
        """
        return self._member5

    def get_member6(self):
        """
        Gets the member6 of the Student.

        Returns:
            get_member6: The member6 of the Student.
        """
        return self._member6

    def set_member1(self, member1):
        """
        Sets the member1 of the Student.

        Params:
            member1: The member1 of the
[Python-ideas] Re: Creating ranges with ellipsis
On Wed, Feb 16, 2022 at 09:44:07AM -0300, Soni L. wrote:

> This might be a silly idea but, would it be a good idea to have
> ...[a:b:c] return a range(a, b, c)?

Similar ideas have been suggested before:

https://mail.python.org/archives/list/python-ideas@python.org/thread/W44PPBJJXETTBQHWCMJB3DRCD6CTXWJT/

https://bugs.python.org/issue42956

What benefit do you see in writing [a:b:c] instead of range(a, b, c)?

--
Steve

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/26HCWBERDZKAC5H2A2PAOLIMMPTDPIZW/
[Python-ideas] Re: Creating a class template file generator inside Python language
On Tue, Feb 15, 2022 at 03:43:54PM +0530, Avanish Gupta wrote:

> I would like to highlight the value my thought offers to the developers.
> They often have to write classes while writing a module. In the class, they
> are supposed to write classes and objects. In a class, typically the
> attributes are private, and we have getters and setters to access and
> modify them.

If developers are writing classes with lots of getters and setters,
they probably think they are writing Java instead of Python.

https://dirtsimple.org/2004/12/python-is-not-java.html

I think you have done the right thing to put your code on PyPI. We'll
wait to see whether people find it useful.

--
Steve

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/4NAAX4YD4M4T6ESKLQLLXTLEJKU74QZG/
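To make the contrast with Java-style accessors concrete, here is a small sketch of the usual Python approach: expose plain attributes, and introduce a property only for an attribute that later needs validation. The Student fields name and grade, and the 0-100 rule, are hypothetical and chosen only for illustration.

```python
class Student:
    """Plain attributes first; a property only where behaviour is needed."""

    def __init__(self, name, grade):
        self.name = name      # ordinary attribute, no getter or setter
        self.grade = grade    # assignment goes through the property setter below

    @property
    def grade(self):
        return self._grade

    @grade.setter
    def grade(self, value):
        # Validation added later without changing how callers use .grade
        if not 0 <= value <= 100:
            raise ValueError("grade must be between 0 and 100")
        self._grade = value


s = Student("Ada", 90)
s.grade = 95          # same syntax as a plain attribute
print(s.name, s.grade)
```

Because callers write s.grade either way, an attribute can start out plain and become a property later, which is why up-front get_/set_ pairs are usually unnecessary.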
[Python-ideas] Re: Creating a class template file generator inside Python language
On Thu, 17 Feb 2022 at 00:18, Avanish Gupta wrote:
>
> Thanks for the previous mail.
> I would like to highlight the value my thought offers to the developers. They
> often have to write classes while writing a module. In the class, they are
> supposed to write classes and objects. In a class, typically the attributes
> are private, and we have getters and setters to access and modify them. My
> thought is to automate the process of creating such class files with command
> line.
>

Even better than automating the generation of code is NOT generating
the code. Just write what you need, instead of writing classes with
classes and objects in them, and getters and setters. None of that is
necessary (in ANY language), and it exists only because people
perpetuate the idea that it's somehow necessary.

I think you'll find that Python can be even easier to use than you
imagined; start by not writing code, and then you'll find that you can
not-debug the code you didn't write, and not-test that non-code, and
so on. Fewer lines of code has an exponential cascading benefit in
development effort!

ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7GHKEHUKF6U34FDYZG7SEVZIUMGJLO75/
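One way to see how little code is actually needed: the standard library's dataclasses module derives __init__, __repr__ and __eq__ from the field declarations, so none of that has to be written or generated. A minimal sketch, with hypothetical fields name and roll_number:

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Field names and types are illustrative only.
    name: str
    roll_number: int = 0


a = Student("Ada", 7)
b = Student("Ada", 7)

a.roll_number = 8     # attributes are read and written directly
print(a)              # uses the generated __repr__
print(b == Student("Ada", 7))
```

The whole class is four lines, compared with the several hundred a getter/setter template produces for six members.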
[Python-ideas] Re: Creating a class template file generator inside Python language
Are you aware of https://docs.python.org/3.10/library/dataclasses.html?
That essentially provides a way to generate the boilerplate without
having to put it all in your code.

On Wed, 16 Feb 2022 at 5:20, Avanish Gupta wrote:

> Thanks for the previous mail.
> I would like to highlight the value my thought offers to the developers.
> They often have to write classes while writing a module. In the class, they
> are supposed to write classes and objects. In a class, typically the
> attributes are private, and we have getters and setters to access and
> modify them. My thought is to automate the process of creating such class
> files with command line.
> For this, I have made my initial attempt to create a module that does so.
> Here is a short video in which I am demonstrating it.
> I am also attaching the details of my terminal to get a bit of more
> details.
>
> I am looking forward to hearing from you if I should work on integrating
> it within the Python Language features, or I should go some other way
> round, or this is not a very useful tool for many people.
>
> [...]
[Python-ideas] Re: Creating ranges with ellipsis
On 2022-02-16 10:45, Steven D'Aprano wrote:
> On Wed, Feb 16, 2022 at 09:44:07AM -0300, Soni L. wrote:
> > This might be a silly idea but, would it be a good idea to have
> > ...[a:b:c] return a range(a, b, c)?
>
> Similar ideas have been suggested before:
>
> https://mail.python.org/archives/list/python-ideas@python.org/thread/W44PPBJJXETTBQHWCMJB3DRCD6CTXWJT/
>
> https://bugs.python.org/issue42956
>
> What benefit do you see in writing [a:b:c] instead of range(a, b, c)?
>
>

*nod* we see.

syntax constructs like these are mostly about taste. it's like being
able to write generators in function calls like list(x for x in foo),
but also having (x for x in foo) instead of using a
gen(x for x in foo) function.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/G2ZYUPM5MARUMELUV4FDDHJS6O6H42TB/
[Python-ideas] Regex pattern matching
Hi,

I've been thinking that it would be nice if regex match objects could be
deconstructed with pattern matching. For example, a simple .obj parser
could use it like this:

    match re.match(r"(v|f) (\d+) (\d+) (\d+)", line):
        case ["v", x, y, z]:
            print("Handle vertex")
        case ["f", a, b, c]:
            print("Handle face")

Sequence patterns would extract groups directly. Mapping patterns could
be used to extract named groups, which would be nice for simple
parsers/tokenizers:

    match re.match(r"(?P<number>\d+)|(?P<add>\+)|(?P<mul>\*)", line):
        case {"number": str(value)}:
            return Token(type="number", value=int(value))
        case {"add": str()}:
            return Token(type="add")
        case {"mul": str()}:
            return Token(type="mul")

Right now, match objects aren't proper sequence or mapping types though,
but that doesn't seem too complicated to achieve. If this is something
that enough people would consider useful I'm willing to look into how to
implement this.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/EKMIJCSJGHJR36W2CNJE4CKO3S5MW3U4/
[Python-ideas] Re: Regex pattern matching
On Wed, 16 Feb 2022 at 14:47, Valentin Berlier wrote:
>
> Hi,
>
> I've been thinking that it would be nice if regex match objects could be
> deconstructed with pattern matching. For example, a simple .obj parser could
> use it like this:
>
>     match re.match(r"(v|f) (\d+) (\d+) (\d+)", line):
>         case ["v", x, y, z]:
>             print("Handle vertex")
>         case ["f", a, b, c]:
>             print("Handle face")
>
> Sequence patterns would extract groups directly. Mapping patterns could be
> used to extract named groups, which would be nice for simple
> parsers/tokenizers:
>
>     match re.match(r"(?P<number>\d+)|(?P<add>\+)|(?P<mul>\*)", line):
>         case {"number": str(value)}:
>             return Token(type="number", value=int(value))
>         case {"add": str()}:
>             return Token(type="add")
>         case {"mul": str()}:
>             return Token(type="mul")
>
> Right now, match objects aren't proper sequence or mapping types though, but
> that doesn't seem too complicated to achieve. If this is something that
> enough people would consider useful I'm willing to look into how to implement
> this.

I'm not sure I really see the benefit of this, but if you want to do
it, couldn't you just write a wrapper?

>>> import re
>>> from collections.abc import Sequence
>>> class MatchAsSeq(Sequence):
...     def __getattr__(self, attr):
...         return getattr(self.m, attr)
...     def __len__(self):
...         return len(self.m.groups())
...     def __init__(self, m):
...         self.m = m
...     def __getitem__(self, n):
...         return self.group(n+1)
...
>>> line = "v 1 12 3"
>>> match MatchAsSeq(re.match(r"(v|f) (\d+) (\d+) (\d+)", line)):
...     case ["v", x, y, z]:
...         print("Handle vertex")
...     case ["f", a, b, c]:
...         print("Handle face")
...
Handle vertex

Paul

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/LCWHARNW5OOCY7CHCXC5CVGFH4OAFOEW/
[Python-ideas] Re: Regex pattern matching
See https://bugs.python.org/issue46692. It's not so easy to make match objects mappings or sequences because of the len() problem.

Eric

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BA6TGJJN65246H7MWYLTUGFSEJ2U2KQ7/
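One plausible reading of "the len() problem", offered here as an assumption rather than a summary of the linked issue: match objects can already be indexed, with index 0 meaning the whole match, so it is unclear whether a length should count group 0. A small sketch of the ambiguity (the helper has_len is hypothetical):

```python
import re

m = re.match(r"(v|f) (\d+) (\d+) (\d+)", "v 1 12 3")

whole = m[0]                  # index 0 is the entire match
first, last = m[1], m[4]      # capture groups are numbered from 1
n_groups = len(m.groups())    # groups() leaves out group 0


def has_len(obj) -> bool:
    """Report whether len(obj) is defined for this object."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


print(whole, first, last, n_groups, has_len(m))
```

Valid indices run from 0 to 4 while groups() has four items, so either choice for len() would disagree with one of the existing conventions.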
[Python-ideas] Re: Creating ranges with ellipsis
16.02.22 14:44, Soni L. wrote:
> This might be a silly idea but, would it be a good idea to have
> ...[a:b:c] return a range(a, b, c)?

See PEP 204. https://www.python.org/dev/peps/pep-0204/

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KOJRH7UHMQEMZN4IPB6TT7NXOQNZAFVT/
[Python-ideas] Re: Creating ranges with ellipsis
> This might be a silly idea but, would it be a good idea to have
> ...[a:b:c] return a range(a, b, c)?

This sort of highly-subjective syntactic sugar makes me wonder whether there would be support for a standard Python preprocessor, like what was suggested in PEP 638 [1].

[1]: https://www.python.org/dev/peps/pep-0638/

- DLD

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/CY2YRG6TCAT5GXB7EKCCCHSKQPS6ZSQE/
[Python-ideas] Re: Regex timeouts
Well, I certainly sparked a lot of interesting discussion, which I have quite enjoyed reading. But to bring this thread back around to its original topic, is there support among the Python maintainers for adding a timeout feature to the Python re library? I will look at the third-party regex library that Jonathan suggested but I still believe a timeout option would be a valuable feature to have in the standard library.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/QFGRFFRP7OO6UAOBVRVVGLVQOFMQF64B/
[Python-ideas] Re: Regex timeouts
On Thu, 17 Feb 2022 at 08:33, J.B. Langston wrote:
>
> Well, I certainly sparked a lot of interesting discussion, which I have quite
> enjoyed reading. But to bring this thread back around to its original topic,
> is there support among the Python maintainers for adding a timeout feature to
> the Python re library? I will look at the third-party regex library that
> Jonathan suggested but I still believe a timeout option would be a valuable
> feature to have in the standard library.

I'm not a maintainer, but I'd personally be against a timeout. It would add overhead to common cases in order to put a shield around pathological ones, and it's difficult to impossible to usefully define the cutoff. Instead, I'd recommend trying some of the simpler parsing options, as explored in the ensuing discussion, to see if one of those has better worst-case performance while still being able to do what's needed. (Hence all the discussion of "no-backtracking" options.)

ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/VTYADNIBL634JSAY4465DWRW3N5T6KMU/
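As a rough illustration of what "simpler options with better worst-case performance" can mean, here is a sketch of my own (not taken from the thread) that accepts the same strings three ways. Nested quantifiers such as (a+)+ are a well-known source of heavy backtracking on near-miss input; the flat pattern and the regex-free check avoid that. All names are illustrative.

```python
import re

# Nested quantifier: on a long run of "a" followed by something else, a
# backtracking engine may try very many ways to split the run.
risky = re.compile(r"(a+)+")

# Same set of accepted strings, with only one way to consume the run.
simple = re.compile(r"a+")


def all_a_no_regex(text: str) -> bool:
    """Accept one or more 'a' characters and nothing else, without a regex."""
    return text != "" and set(text) == {"a"}


def all_a(text: str, pattern=simple) -> bool:
    """Accept text only if the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None
```

The checks agree on every input; they differ only in how much work a failing match can cost, which is why the snippet is exercised on short strings only.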
[Python-ideas] Re: Regex timeouts
[J.B. Langston]
> Well, I certainly sparked a lot of interesting discussion, which I have
> quite enjoyed reading. But to bring this thread back around to its
> original topic, is there support among the Python maintainers for
> adding a timeout feature to the Python re library?

Buried in the fun discussion was my guess: no way. Python's re is effectively dead legacy code, with no current "owner". Its commit history shows very little activity for some years already. Most commits are due to generic "code cleanup" crusades that have nothing specific to do with the algorithms. None required non-trivial knowledge of the implementation.

Here's the most recent I found that actually changed behavior:

"""
commit 6cc8ac949907b9a1c0f73709c6978b7a43e634e3
Author: Zackery Spytz
Date: Fri May 21 14:02:42 2021 -0700

    bpo-40736: Improve the error message for re.search() TypeError (GH-23312)

    Include the invalid type in the error message.
"""

A trivial change.

> I will look at the third-party regex library that Jonathan suggested but
> I still believe a timeout option would be a valuable feature to have
> in the standard library.

Which is the problem: regex has _dozens_ of features that would be valuable to have in the standard library. regex is in fact one of the best regexp libraries on the planet. It already has timeouts, and other features (like possessive quantifiers) that are actually (unlike timeouts) frequently asked about by many programmers.

In fact regex started life intending to go into core Python, in 2008:

https://bugs.python.org/issue3825

That stretched on and on, and the endless bikeshedding eventually appeared to fizzle out in 2014:

https://bugs.python.org/issue2636

In 2021 a core dev eventually rejected it, as by then MRAB had long since released it as a successful extension module. I assume - but don't know - he got burned out by "the endless bikeshedding" on those issue reports.

In any case, no, no core dev I know of is going to devote their limited time to reproducing a tiny subset of regex's many improvements in Python's legacy engine. In fact, "install regex!" is such an obvious choice at this point that I wouldn't even give time to just reviewing a patch that added timeouts.

BTW, I didn't mention regex in your BPO report because I didn't know at the time it already implemented timeouts. I learned that in this thread.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2EOIYQCKIWD57SC4IDYNCSSB65LDGPIU/
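For readers who want to see what the existing timeout support looks like, here is a minimal sketch. It assumes, per the regex package's documentation, that its matching functions accept a timeout= keyword in seconds and raise TimeoutError when the limit is exceeded. The wrapper name search_with_timeout is hypothetical, and it falls back to the standard re module (with no timeout at all) when regex is not installed.

```python
import re

try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None


def search_with_timeout(pattern, text, seconds=1.0):
    """Search text for pattern, giving up after `seconds` where supported.

    With regex installed, a search that runs too long raises TimeoutError.
    Without it, this is a plain re.search() and the limit is ignored.
    """
    if regex is None:
        return re.search(pattern, text)
    return regex.search(pattern, text, timeout=seconds)
```

Callers would wrap the call in try/except TimeoutError and decide what a timed-out search means for them, for example rejecting the input.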
[Python-ideas] Re: Regex timeouts
On 2022-02-16 22:13, Tim Peters wrote:
> In 2021 a core dev eventually rejected it, as by then MRAB had long
> since released it as a successful extension module. I assume - but
> don't know - he got burned out by "the endless bikeshedding" on those
> issue reports.

I eventually decided against having it added to the standard library because that would tie fixes and additions to Python's release cycle, and there's that adage that Python has "batteries included", but not nuclear reactors. PyPI is a better place for it, for those who need more than what the standard re module provides.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5K3XWIY7YK4RUMIZGYWNETB3N74PTLPZ/
[Python-ideas] Re: Regex timeouts
[MRAB]
> I eventually decided against having it added to the standard library
> because that would tie fixes and additions to Python's release cycle,
> and there's that adage that Python has "batteries included", but not
> nuclear reactors. PyPI is a better place for it, for those who need more
> than what the standard re module provides.

I think it's a puzzle with no good solution. For example, while atomic groups may have been "nuclear reactor" level in 2008, they're ubiquitous now. regexps in industry practice keep evolving, and so does your regex module, but Python's re module is frozen in an ever-receding past. Nobody wants to work on it because, well, "regex already does that! In fact, it's been doing it for 15 years already". Your module is _too_ successful for Python's good ;-)

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/B5MY3IFRELONLC535C5QCEVDEGPUPQRJ/
[Python-ideas] Re: Regex timeouts
Thanks for the conclusive answer. I will check out the regex library soon.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5GWWTBBBJEV4FZBHLHIAL4QYGUOCB6TM/
[Python-ideas] Re: Creating ranges with ellipsis
> > This might be a silly idea but, would it be a good idea to have
> > ...[a:b:c] return a range(a, b, c)?

If a 'thunderscore' is acceptable:

    import itertools

    class _ranger:
        def __getitem__(self, key):
            if isinstance(key, slice):
                if key.stop is None:
                    # Open-ended slice: count upward forever.
                    return itertools.count(key.start, key.step or 1)
                return range(key.start, key.stop, key.step or 1)
            # Plain index: ___[x] behaves like range(x).
            return range(key)

    ___ = _ranger()

Trying to write it brings out lots of questions like what would [:y] do, or [:], [::z], etc. Only [x], [x:y], [x:], [x::z], [x:y:z] seem to make sense.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/WARMRS7GMQHYEBGR5FTBKHW436DWRWW6/
[Python-ideas] Re: Regex timeouts
[J.B. Langston]
> Thanks for the conclusive answer.

Not conclusive - just my opinion. Which is informed, but not infallible ;-)

> I will check out the regex library soon.

You may not realize how easy this is? Just in case: go to a shell and type

    pip install regex

(or, on Windows, "python -m pip install regex" in a DOS box).

That's it. You're done. Now you can use regex. In some cases, you can put "import regex as re" at the top of a module. At worst, replace instances of "re" with "regex". Stay away from the new features, and it's highly compatible with Python's re.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/C35D2Z5GZSWN3T46OR5LECU7VG5YD3LQ/
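The "import regex as re" swap can be made optional, so that a module keeps working where regex is not installed. This is a sketch of that pattern, not something proposed in the thread; the DATE pattern and the parse_date function are made up for illustration and use only features common to both modules.

```python
try:
    import regex as re  # third-party: pip install regex
except ImportError:
    import re           # fall back to the standard library module

# Only syntax that both modules support: named groups and counted repeats.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d\d)-(?P<day>\d\d)")


def parse_date(text):
    """Return (year, month, day) for the first ISO-style date in text, or None."""
    m = DATE.search(text)
    if m is None:
        return None
    return tuple(int(m.group(name)) for name in ("year", "month", "day"))
```

Code written this way cannot rely on regex-only features such as timeouts, which is the price of the fallback.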
[Python-ideas] Re: Regex timeouts
> On 17 Feb 2022, at 01:04, Tim Peters wrote:
>
> [J.B. Langston]
>> Thanks for the conclusive answer.
>
> Not conclusive - just my opinion. Which is informed, but not infallible ;-)
>
>> I will check out the regex library soon.
>
> You may not realize how easy this is? Just in case: go to a shell and type
>
>     pip install regex
>
> (or, on Windows, "python -m pip install regex" in a DOS box).
>
> That's it. You're done. Now you can use regex. In some cases, you can
> put "import regex as re" at the top of a module. At worst, replace
> instances of "re" with "regex". Stay away from the new features, and
> it's highly compatible with Python's re.

I suspect that, as it would for me, "check out" here means reading the docs to understand regex's features. The install is trivial.

Barry

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KLSG7EECZ3UWH4KJ7PAAOEI37XWTKUNP/