Re: Python's regular expression help
On Apr 29, 11:49 am, Tim Chase wrote:
> On 04/29/2010 01:00 PM, goldtech wrote:
> > Trying to start out with simple things but apparently there's some
> > basics I need help with. This works OK:
> > import re
> > p = re.compile('(ab*)(sss)')
> > m = p.match( 'absss' )
> >
> > f=r'abss'
> > f
> > 'abss'
> > m = p.match( f )
> > m.group(0)
> > Traceback (most recent call last):
> >   File "", line 1, in
> >     m.group(0)
> > AttributeError: 'NoneType' object has no attribute 'group'
>
> 'absss' != 'abss'
>
> Your regexp looks for 3 "s", your "f" contains only 2. So the
> regexp object doesn't, well, match. Try
>
>   f = 'absss'
>
> and it will work. As an aside, using raw-strings for this text
> doesn't change anything, but if you want, you _can_ write it as
>
>   f = r'absss'
>
> if it will make you feel better :)
>
> > How do I implement a regex on a multiline string? I thought this
> > might work but there's problem:
> >
> > p = re.compile('(ab*)(sss)', re.S)
> > m = p.match( 'ab\nsss' )
> > m.group(0)
> > Traceback (most recent call last):
> >   File "", line 1, in
> >     m.group(0)
> > AttributeError: 'NoneType' object has no attribute 'group'
>
> Well, it depends on what you want to do -- regexps are fairly
> precise, so if you want to allow whitespace between the two, you
> can use
>
>   r = re.compile(r'(ab*)\s*(sss)')
>
> If you want to allow whitespace anywhere, it gets uglier, and
> your capture/group results will contain that whitespace:
>
>   r'(a\s*b*)\s*(s\s*s\s*s)'
>
> Alternatively, if you don't want to allow arbitrary whitespace
> but only newlines, you can use "\n*" instead of "\s*"
>
> -tkc

Yes, most of my problem is w/my patterns not w/any python re syntax.

I thought re.S will take a multiline string with any spaces or newlines and make it appear as one line to the regex. Make "\n" be ignored in a way...still playing w/it.

Thanks for the help!
-- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression help
On 04/29/2010 01:00 PM, goldtech wrote: Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' ) f=r'abss' f 'abss' m = p.match( f ) m.group(0) Traceback (most recent call last): File "", line 1, in m.group(0) AttributeError: 'NoneType' object has no attribute 'group' 'absss' != 'abss' Your regexp looks for 3 "s", your "f" contains only 2. So the regexp object doesn't, well, match. Try f = 'absss' and it will work. As an aside, using raw-strings for this text doesn't change anything, but if you want, you _can_ write it as f = r'absss' if it will make you feel better :) How do I implement a regex on a multiline string? I thought this might work but there's problem: p = re.compile('(ab*)(sss)', re.S) m = p.match( 'ab\nsss' ) m.group(0) Traceback (most recent call last): File "", line 1, in m.group(0) AttributeError: 'NoneType' object has no attribute 'group' Well, it depends on what you want to do -- regexps are fairly precise, so if you want to allow whitespace between the two, you can use r = re.compile(r'(ab*)\s*(sss)') If you want to allow whitespace anywhere, it gets uglier, and your capture/group results will contain that whitespace: r'(a\s*b*)\s*(s\s*s\s*s)' Alternatively, if you don't want to allow arbitrary whitespace but only newlines, you can use "\n*" instead of "\s*" -tkc -- http://mail.python.org/mailman/listinfo/python-list
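A minimal sketch of the advice above, assuming Python 3 (the thread itself predates it). The names strict, relaxed and text are made up for illustration. It shows that re.S alone does not let the original pattern cross the newline, while the \s* variant does:

```python
import re

# Original pattern: nothing is allowed between the 'b's and 'sss'.
# re.S (DOTALL) only changes what '.' matches, and this pattern has no '.'.
strict = re.compile(r'(ab*)(sss)', re.S)

# Suggested variant: optional whitespace between the two groups.
relaxed = re.compile(r'(ab*)\s*(sss)')

text = 'ab\nsss'

strict_match = strict.match(text)     # None, so calling .group() on it fails
relaxed_match = relaxed.match(text)   # the newline is consumed by \s*

# Test for None before calling .group() to avoid the AttributeError.
if relaxed_match is not None:
    print(relaxed_match.groups())     # ('ab', 'sss')
```

The whitespace is matched outside the parentheses, so neither captured group contains the newline.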
Re: Python's regular expression help
goldtech wrote: Hi, Trying to start out with simple things but apparently there's some basics I need help with. This works OK: import re p = re.compile('(ab*)(sss)') m = p.match( 'absss' ) m.group(0) 'absss' m.group(1) 'ab' m.group(2) 'sss' ... But two questions: How can I operate a regex on a string variable? I'm doing something wrong here: f=r'abss' f 'abss' m = p.match( f ) m.group(0) Traceback (most recent call last): File "", line 1, in m.group(0) AttributeError: 'NoneType' object has no attribute 'group' Look closely: the regex contains 3 letter 's', but the string referred to by f has only 2. How do I implement a regex on a multiline string? I thought this might work but there's problem: p = re.compile('(ab*)(sss)', re.S) m = p.match( 'ab\nsss' ) m.group(0) Traceback (most recent call last): File "", line 1, in m.group(0) AttributeError: 'NoneType' object has no attribute 'group' Thanks for the newbie regex help, Lee The string contains a newline between the 'b' and the 's', but the regex isn't expecting any newline (or any other character) between the 'b' and the 's', hence no match. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression help
On 29/04/2010 20:00, goldtech wrote:
> Hi,
> Trying to start out with simple things but apparently there's some
> basics I need help with. This works OK:
> import re
> p = re.compile('(ab*)(sss)')
> m = p.match( 'absss' )
> m.group(0)
> 'absss'
> m.group(1)
> 'ab'
> m.group(2)
> 'sss'
> ...
> But two questions:
> How can I operate a regex on a string variable?
> I'm doing something wrong here:
> f=r'abss'
> f
> 'abss'
> m = p.match( f )
> m.group(0)
> Traceback (most recent call last):
>   File "", line 1, in
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'
>
> How do I implement a regex on a multiline string? I thought this
> might work but there's problem:
> p = re.compile('(ab*)(sss)', re.S)
> m = p.match( 'ab\nsss' )
> m.group(0)
> Traceback (most recent call last):
>   File "", line 1, in
>     m.group(0)
> AttributeError: 'NoneType' object has no attribute 'group'
>
> Thanks for the newbie regex help, Lee

For multiline, I use re.DOTALL.

I do not know match(); findall is pretty efficient:

  my = '<a href="...">LINK</a>'
  res = re.findall(">(.*?)<", my)
  >>> res
  ['LINK']

Dorian
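To make the re.DOTALL remark concrete, here is a small sketch, assuming Python 3. The sample string and variable names are invented for illustration. The only thing re.DOTALL (alias re.S) changes is that "." may also match a newline:

```python
import re

# Hypothetical sample: the text between the tags spans two lines.
sample = '<b>two\nlines</b>'

# Without DOTALL, '.' stops at the newline, so nothing is found.
without_dotall = re.findall(r'>(.*?)<', sample)

# With DOTALL, '.' also matches '\n', so the whole span is captured.
with_dotall = re.findall(r'>(.*?)<', sample, re.DOTALL)

print(without_dotall)   # []
print(with_dotall)      # ['two\nlines']
```

This is why the flag made no difference to the pattern in the original question, which contains no "." at all.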
Python's regular expression help
Hi, Trying to start out with simple things but apparently there's some basics I need help with. This works OK: >>> import re >>> p = re.compile('(ab*)(sss)') >>> m = p.match( 'absss' ) >>> m.group(0) 'absss' >>> m.group(1) 'ab' >>> m.group(2) 'sss' ... But two questions: How can I operate a regex on a string variable? I'm doing something wrong here: >>> f=r'abss' >>> f 'abss' >>> m = p.match( f ) >>> m.group(0) Traceback (most recent call last): File "", line 1, in m.group(0) AttributeError: 'NoneType' object has no attribute 'group' How do I implement a regex on a multiline string? I thought this might work but there's problem: >>> p = re.compile('(ab*)(sss)', re.S) >>> m = p.match( 'ab\nsss' ) >>> m.group(0) Traceback (most recent call last): File "", line 1, in m.group(0) AttributeError: 'NoneType' object has no attribute 'group' >>> Thanks for the newbie regex help, Lee -- http://mail.python.org/mailman/listinfo/python-list
Re: A bug in Python's regular expression engine?
On Nov 27, 10:52 am, MonkeeSage <[EMAIL PROTECTED]> wrote: > On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality" > > <[EMAIL PROTECTED]> wrote: > > That is funny. Thank you for your help... > > Just for clarification, what does the "r" in your code do? > > It means a "raw" string (as you know ruby, think of it like %w{}): > > This page explains about string literal prefixes (see especially the > end-notes): > > http://docs.python.org/ref/strings.html > > HTH, > Jordan Arg! %w{} should have said %q{} -- http://mail.python.org/mailman/listinfo/python-list
Re: A bug in Python's regular expression engine?
On Nov 27, 10:19 am, "Just Another Victim of the Ambient Morality" <[EMAIL PROTECTED]> wrote: > That is funny. Thank you for your help... > Just for clarification, what does the "r" in your code do? It means a "raw" string (as you know ruby, think of it like %w{}): This page explains about string literal prefixes (see especially the end-notes): http://docs.python.org/ref/strings.html HTH, Jordan -- http://mail.python.org/mailman/listinfo/python-list
Re: A bug in Python's regular expression engine?
"Paul Hankin" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality"
> <[EMAIL PROTECTED]> wrote:
>> This won't compile for me:
>>
>> regex = re.compile('(.*\\).*')
>>
>> I get the error:
>>
>> sre_constants.error: unbalanced parenthesis
>>
>> I'm running Python 2.5 on WinXP. I've tried this expression with
>> another RE engine in another language and it works just fine which leads
>> me to believe the problem is Python. Can anyone confirm or deny this bug?
>
> Your code is equivalent to:
> regex = re.compile(r'(.*\).*')
>
> Written like this, it's easier to see that you've started a regular
> expression group with '(', but it's never closed since your closed
> parenthesis is escaped (which causes it to match a literal ')' when
> used). Hence the reported error (which isn't a bug).
>
> Perhaps you meant this?
> regex = re.compile(r'(.*\\).*')
>
> This matches any number of characters followed by a backslash (group
> 1), and then any number of characters. If you're using this for path
> splitting filenames under Windows, you should look at os.path.split
> instead of writing your own.

Indeed, I did end up using os.path functions, instead.

I think I see what's going on. Backslash has special meaning in both the regular expression and Python string declarations. So, my version should have been something like this:

  regex = re.compile('(.*\\\\).*')

That is funny. Thank you for your help...

Just for clarification, what does the "r" in your code do?
Re: A bug in Python's regular expression engine?
On Nov 27, 3:48 pm, "Just Another Victim of the Ambient Morality" <[EMAIL PROTECTED]> wrote: > This won't compile for me: > > regex = re.compile('(.*\\).*') > > I get the error: > > sre_constants.error: unbalanced parenthesis > > I'm running Python 2.5 on WinXP. I've tried this expression with > another RE engine in another language and it works just fine which leads me > to believe the problem is Python. Can anyone confirm or deny this bug? Your code is equivalent to: regex = re.compile(r'(.*\).*') Written like this, it's easier to see that you've started a regular expression group with '(', but it's never closed since your closed parenthesis is escaped (which causes it to match a literal ')' when used). Hence the reported error (which isn't a bug). Perhaps you meant this? regex = re.compile(r'(.*\\).*') This matches any number of characters followed by a backslash (group 1), and then any number of characters. If you're using this for path splitting filenames under Windows, you should look at os.path.split instead of writing your own. HTH -- Paul Hankin -- http://mail.python.org/mailman/listinfo/python-list
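The explanation above can be checked directly. This is a sketch assuming Python 3, where the compile error is raised as re.error. The sample path is made up, and ntpath is used only so the Windows-style split behaves the same on any platform (on Windows, os.path is ntpath):

```python
import re
import ntpath

# The non-raw literal and the raw literal spell the same 8-character pattern.
same_pattern = ('(.*\\).*' == r'(.*\).*')

# The regex engine sees (.*\).* where \) is a literal ')', so the group
# opened by '(' is never closed and compilation fails.
try:
    re.compile('(.*\\).*')
    compiled_ok = True
except re.error:
    compiled_ok = False

# Raw string with a doubled backslash: the regex matches a literal backslash.
regex = re.compile(r'(.*\\).*')
path = r'C:\temp\file.txt'           # hypothetical example path
head = regex.match(path).group(1)    # everything up to the last backslash

# The library route suggested above, with Windows path rules.
directory, filename = ntpath.split(path)

print(same_pattern, compiled_ok)
print(head, directory, filename)
```

Note that the regex keeps the trailing backslash in group 1, whereas the split function drops the separator.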
Re: A bug in Python's regular expression engine?
On 2007-11-27, Just Another Victim of the Ambient Morality <[EMAIL PROTECTED]> wrote: > This won't compile for me: > > > regex = re.compile('(.*\\).*') > > I get the error: > sre_constants.error: unbalanced parenthesis Hint 1: Always assume that errors are in your own code. Blaming library code and language implementations will get you nowhere most of the time. Hint 2: regular expressions and Python strings use the same escape character. Hint 3: Consult the Python documentation about raw strings, and what they are meant for. -- Neil Cerutti -- http://mail.python.org/mailman/listinfo/python-list
Re: A bug in Python's regular expression engine?
Just Another Victim of the Ambient Morality wrote:
> This won't compile for me:
>
> regex = re.compile('(.*\\).*')
>
> I get the error:
>
> sre_constants.error: unbalanced parenthesis
>
> I'm running Python 2.5 on WinXP. I've tried this expression with
> another RE engine in another language and it works just fine which leads
> me to believe the problem is Python. Can anyone confirm or deny this bug?

It pretty much says what the problem is - you escaped the closing parenthesis, resulting in an invalid regex.

Either use raw-strings or put the proper amount of backslashes in your string:

  regex = re.compile(r'(.*\\).*')   # raw string literal
  regex = re.compile('(.*\\\\).*')  # doubled backslashes, each pair meaning one escaped backslash

Diez
A bug in Python's regular expression engine?
This won't compile for me: regex = re.compile('(.*\\).*') I get the error: sre_constants.error: unbalanced parenthesis I'm running Python 2.5 on WinXP. I've tried this expression with another RE engine in another language and it works just fine which leads me to believe the problem is Python. Can anyone confirm or deny this bug? Thank you... -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
Dave Hansen wrote: > On Wed, 10 May 2006 06:44:27 GMT in comp.lang.python, Edward Elliott > <[EMAIL PROTECTED]> wrote: > > > > > > Would I recommend perl for readable, maintainable code? No, not > > when better options like Python are available. But it can be done > > with some effort. > > I'm reminded of a comment made a few years ago by John Levine, > moderator of comp.compilers. He said something like "It's clearly > possible to write good code in C++. It's just that no one does." Reminds me of the quote that used to appear on the front page of the ViewCVS project (seems to have gone now that they've moved and renamed themselves to ViewVC). Can't recall the attribution off the top of my head: "[Perl] combines the power of C with the readability of PostScript" Scathing ... but very funny :-) Dave. -- -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
On Wed, 10 May 2006 06:44:27 GMT in comp.lang.python, Edward Elliott <[EMAIL PROTECTED]> wrote: > >Would I recommend perl for readable, maintainable code? No, not when better >options like Python are available. But it can be done with some effort. I'm reminded of a comment made a few years ago by John Levine, moderator of comp.compilers. He said something like "It's clearly possible to write good code in C++. It's just that no one does." Regards, -=Dave -- Change is inevitable, progress is not. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
bruno at modulix wrote:
> From a readability/maintenance POV, Perl is a perfect nightmare.

It's certainly true that perl lacks the eminently readable quality of python. But then so do C, C++, Java, and a lot of other languages.

And I'll grant you that perl is more susceptible to the 'executable line-noise' style than most other languages. This results from its heritage as a quick-and-dirty awk/sed type text processing language.

But perl doesn't *have* to look that way, and not every perl program is a 'perfect nightmare'. If you follow good practices like turning on strict checking, using readable variable names, avoiding $_, etc, you can produce pretty readable and maintainable code. It takes some discipline, but it's very doable. I've worked with some perl programs for over 5 years without any trouble. About the only thing you can't avoid are the sigils everywhere.

Would I recommend perl for readable, maintainable code? No, not when better options like Python are available. But it can be done with some effort.
Re: Python's regular expression?
Mirco Wahab wrote:
> If you wouldn't need dictionary lookup and
> get away with associated categories, all
> you'd have to do would be this:
>
>    $PATTERN = qr/
>        (blue |white |red)(?{'Colour'})
>      | (socks|tights)(?{'Garment'})
>      | (boot |shoe |trainer)(?{'Footwear'})
>    /x;
>
>    $t = 'blue socks and red shoes';
>    print "$^R: $^N\n" while( $t=~/$PATTERN/g );
>
> What's the point of all that? IMHO, Python's
> Regex support is quite good and useful, but
> won't give you an edge over Perl's in the end.

If you are desperate to collapse the code down to a single print statement you can do that easily in Python as well:

  >>> PATTERN = '''
      (?P<Colour>blue |white |red)
    | (?P<Garment>socks|tights)
    | (?P<Footwear>boot |shoe |trainer)
  '''
  >>> t = 'blue socks and red shoes'
  >>> print '\n'.join("%s:%s" % (match.lastgroup, match.group(match.lastgroup)) for match in re.finditer(PATTERN, t, re.VERBOSE))
  Colour:blue
  Garment:socks
  Colour:red
  Footwear:shoe
Re: Python's regular expression?
Davy wrote:
> Hi all,
> (snip)
> Does Python support robust regular expression like Perl?

Yes.

> And Python and Perl's File content manipulation, which is better?

From a raw perf and write-only POV, Perl clearly beats Python (regarding I/O, Perl is faster than C - or at least it was the last time I benched it on a Linux box).

From a readability/maintenance POV, Perl is a perfect nightmare.

> Any suggestions will be appreciated!

http://pythonology.org/success&story=esr

--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for p in '[EMAIL PROTECTED]'.split('@')])"
Re: Python's regular expression?
Hi Duncan

> Nick Craig-Wood wrote:
>> Which translates to
>> match = re.search('(blue|white|red)', t)
>> if match:
>> else:
>> if match:
>> else:
>> if match:
>
> This of course gives priority to colours and only looks for garments or
> footwear if it hasn't matched on a prior pattern. If you actually
> wanted to match the first occurrence of any of these (or if the condition
> was re.match instead of re.search) then named groups can be a nice way of
> simplifying the code:

A good point. And a good example of when to use named capture group references. This is easily extended for 'spitting out' all other occurring categories (see below).

> PATTERN = '''
> (?P<c>blue|white|red)
> ...

This is one nice thing in Python's regex syntax; you have to emulate the ?P-thing in other regex systems more or less 'awk'-wardly ;-)

> For something this simple the titles and group names could be the
> same, but I'm assuming real code might need a bit more.

No no, this is quite good because it involves some math-generated table-code lookup.

I managed somehow to extend your example in order to spit out all matches and their corresponding category:

  import re

  PATTERN = '''
      (?P<c>blue |white |red)
    | (?P<g>socks|tights)
    | (?P<f>boot |shoe |trainer)
  '''
  PATTERN = re.compile(PATTERN, re.VERBOSE)
  TITLES = { 'c': 'Colour', 'g': 'Garment', 'f': 'Footwear' }

  t = 'blue socks and red shoes'
  for match in PATTERN.finditer(t):
      grp = match.lastgroup
      print "%s: %s" % ( TITLES[grp], match.group(grp) )

which writes out the expected:

  Colour: blue
  Garment: socks
  Colour: red
  Footwear: shoe

The corresponding Perl program would look like this:

  $PATTERN = qr/
      (blue |white |red)(?{'c'})
    | (socks|tights)(?{'g'})
    | (boot |shoe |trainer)(?{'f'})
  /x;

  %TITLES = (c =>'Colour', g =>'Garment', f =>'Footwear');

  $t = 'blue socks and red shoes';
  print "$TITLES{$^R}: $^N\n" while( $t=~/$PATTERN/g );

and prints the same:

  Colour: blue
  Garment: socks
  Colour: red
  Footwear: shoe

You don't have nice named match references (?P<..>) in Perl 5, so you have to emulate this by an ordinary code assertion (?{..}) and set some value ($^R) on the fly - which is not that bad in the end (imho).

(?{..}) means "zero-width code assertion"; this sets the Perl-predefined $^R to the evaluated value of the {...}.

As you can see, the pattern matching related part reduces from 4 lines to one line.

If you wouldn't need dictionary lookup and get away with associated categories, all you'd have to do would be this:

  $PATTERN = qr/
      (blue |white |red)(?{'Colour'})
    | (socks|tights)(?{'Garment'})
    | (boot |shoe |trainer)(?{'Footwear'})
  /x;

  $t = 'blue socks and red shoes';
  print "$^R: $^N\n" while( $t=~/$PATTERN/g );

What's the point of all that? IMHO, Python's regex support is quite good and useful, but won't give you an edge over Perl's in the end.

Thanks & Regards

Mirco
Re: Python's regular expression?
Nick Craig-Wood wrote:
> Which translates to
>
> match = re.search('(blue|white|red)', t)
> if match:
>     print "Colour:", match.group(1)
> else:
>     match = re.search('(socks|tights)', t)
>     if match:
>         print "Garment:", match.group(1)
>     else:
>         match = re.search('(boot|shoe|trainer)', t)
>         if match:
>             print "Footwear:", match.group(1)
>             # indented ad infinitum!

This of course gives priority to colours and only looks for garments or footwear if it hasn't matched on a prior pattern. If you actually wanted to match the first occurrence of any of these (or if the condition was re.match instead of re.search) then named groups can be a nice way of simplifying the code:

  PATTERN = '''
      (?P<c>blue|white|red)
    | (?P<g>socks|tights)
    | (?P<f>boot|shoe|trainer)
  '''
  PATTERN = re.compile(PATTERN, re.VERBOSE)
  TITLES = { 'c': 'Colour', 'g': 'Garment', 'f': 'Footwear' }

  match = PATTERN.search(t)
  if match:
      grp = match.lastgroup
      print "%s: %s" % (TITLES[grp], match.group(grp))

For something this simple the titles and group names could be the same, but I'm assuming real code might need a bit more.
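For anyone trying the named-group approach on a current interpreter, here is a sketch in Python 3 syntax. It assumes the group names c, g and f implied by the TITLES dictionary; the categorise function is a hypothetical helper, not something from the thread:

```python
import re

# One alternation with a named group per category. Only one of the
# groups takes part in any given match, so match.lastgroup names it.
PATTERN = re.compile(r'''
    (?P<c>blue|white|red)
  | (?P<g>socks|tights)
  | (?P<f>boot|shoe|trainer)
''', re.VERBOSE)

TITLES = {'c': 'Colour', 'g': 'Garment', 'f': 'Footwear'}


def categorise(text):
    """Return (title, matched_text) pairs in order of appearance."""
    return [(TITLES[m.lastgroup], m.group(m.lastgroup))
            for m in PATTERN.finditer(text)]


pairs = categorise('blue socks and red shoes')
for title, word in pairs:
    print("%s: %s" % (title, word))
```

Using search() instead of finditer() would give only the first pair, which is the behaviour of the code in the message above.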
Re: Python's regular expression?
Mirco Wahab <[EMAIL PROTECTED]> wrote:
> After some minutes in this NG I start to get
> the picture. So I narrowed the above regex-question
> down to a nice equivalence between Perl and Python:
>
> Python:
>
>    import re
>
>    t = 'blue socks and red shoes'
>    if re.match('blue|white|red', t):
>        print t
>
>    t = 'blue socks and red shoes'
>    if re.search('blue|white|red', t):
>        print t
>
> Perl:
>
>    use Acme::Pythonic;
>
>    $t = 'blue socks and red shoes'
>    if $t =~ /blue|white|red/:
>        print $t
>
> And Python Regexes eventually lost (for me) some of
> their (what I believed) 'clunky appearance' ;-)

If you are used to perl regexes there is one clunkiness of python regexps which you'll notice eventually...

Let's make the above example a bit more "real world", ie use the matched item in some way...

Perl:

  $t = 'blue socks and red shoes';
  if ( $t =~ /(blue|white|red)/ )
  {
      print "Colour: $1\n";
  }

Which prints

  Colour: blue

In python you have to express this like

  import re

  t = 'blue socks and red shoes'
  match = re.search('(blue|white|red)', t)
  if match:
      print "Colour:", match.group(1)

Note the extra variable "match". You can't do assignment in an expression in python which makes for the extra verbosity, and you need a variable to store the result of the match in (since python doesn't have the magic $1..$9 variables).

This becomes particularly frustrating when you have to do a series of regexp matches, eg

  if ( $t =~ /(blue|white|red)/ )
  {
      print "Colour: $1\n";
  }
  elsif ( $t =~ /(socks|tights)/)
  {
      print "Garment: $1\n";
  }
  elsif ( $t =~ /(boot|shoe|trainer)/)
  {
      print "Footwear: $1\n";
  }

Which translates to

  match = re.search('(blue|white|red)', t)
  if match:
      print "Colour:", match.group(1)
  else:
      match = re.search('(socks|tights)', t)
      if match:
          print "Garment:", match.group(1)
      else:
          match = re.search('(boot|shoe|trainer)', t)
          if match:
              print "Footwear:", match.group(1)
              # indented ad infinitum!

You can use a helper class to get over this frustration like this

  import re

  class Matcher:
      def search(self, r, s):
          self.value = re.search(r, s)
          return self.value
      def __getitem__(self, i):
          return self.value.group(i)

  m = Matcher()
  t = 'blue socks and red shoes'

  if m.search(r'(blue|white|red)', t):
      print "Colour:", m[1]
  elif m.search(r'(socks|tights)', t):
      print "Garment:", m[1]
  elif m.search(r'(boot|shoe|trainer)', t):
      print "Footwear:", m[1]

Having made the transition from perl to python a couple of years ago, I find myself using regexps much less. In perl everything looks like it needs a regexp, but python has a much richer set of string methods, eg .startswith, .endswith, good subscripting and the nice "in" operator for strings.

--
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
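The "no assignment in an expression" limitation described above applied to the Python of the time. As a hedged aside, assuming Python 3.8 or later, the assignment expression operator := allows the same flat if/elif chain without a helper class. The classify function below is a made-up name for illustration:

```python
import re


def classify(text):
    """Return a label for the first category found, or None.

    Requires Python 3.8+ for the := (assignment expression) operator.
    """
    if m := re.search(r'(blue|white|red)', text):
        return "Colour: " + m.group(1)
    elif m := re.search(r'(socks|tights)', text):
        return "Garment: " + m.group(1)
    elif m := re.search(r'(boot|shoe|trainer)', text):
        return "Footwear: " + m.group(1)
    return None


print(classify('blue socks and red shoes'))   # Colour: blue
print(classify('old tights'))                 # Garment: tights
```

As with the Perl elsif chain, colours take priority: the later patterns are only tried when the earlier ones fail.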
Re: Python's regular expression?
Hi John

>> But what would be an appropriate use
>> of search() vs. match()? When to use what?
>
> ReadTheFantasticManual :-)

From the manual you mentioned, I don't get the point of 'match'. So why should you use an extra function entry match(),

  re.match('whatever', t):

which is, according to the FM, equivalent to (a special case of?)

  re.search('^whatever', t):

For me, it looks like match() should be used on simple string comparisons like a 'ramped up C-strcmp()'.

Or isn't it? Maybe I don't get it ;-)

Thanks

Mirco
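On the "equivalent to search('^whatever')" point, a small sketch assuming Python 3 and an invented two-line sample string. Without flags the two calls do agree, but under re.MULTILINE they differ, because ^ may then match after a newline while match() still only tries the very start of the string:

```python
import re

text = 'blue socks\nred shoes'   # hypothetical two-line input

# No flags: 'red' is not at position 0, so both calls return None.
plain_match = re.match(r'red', text)
plain_search = re.search(r'^red', text)

# re.MULTILINE: ^ also matches just after each newline, so search()
# finds 'red' on the second line. match() is unaffected by the flag.
multi_match = re.match(r'red', text, re.MULTILINE)
multi_search = re.search(r'^red', text, re.MULTILINE)

print(plain_match, plain_search, multi_match)
print(multi_search.start())      # index of 'red' on the second line
```

So match() is the anchored-at-start form regardless of flags, which is one reason it exists as a separate entry point.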
Re: Python's regular expression?
On 8/05/2006 11:13 PM, Mirco Wahab wrote: > Hi John > >>>import re >>> >>>t = 'blue socks and red shoes' >>>p = re.compile('(blue|white|red)') >>>if p.match(t): >> What do you expect when t == "green socks and red shoes"? Is it possible >> that you mean to use search() rather than match()? > > This is interesting. > What's in this example the difference then between: I suggest that you (a) read the description on the difference between search and match in the manual (b) try out search and match on both your original string and the one I proposed. > >import re > >t = 'blue socks and red shoes' >if re.compile('blue|white|red').match(t): > print t > > and > >t = 'blue socks and red shoes' >if re.search('blue|white|red', t): > print t [snip] > > But what would be an appropriate use > of search() vs. match()? When to use what? ReadTheFantasticManual :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
Hi Duncan > There is no need to compile the regular expression in advance in Python > either: > ... > The only advantage to compiling in advance is a small speed up, and most of > the time that won't be significant. I read 'some' introductions into Python Regexes and got confused in the first place when to use what and why. After some minutes in this NG I start to get the picture. So I narrowed the above regex-question down to a nice equivalence between Perl and Python: Python: import re t = 'blue socks and red shoes' if re.match('blue|white|red', t): print t t = 'blue socks and red shoes' if re.search('blue|white|red', t): print t Perl: use Acme::Pythonic; $t = 'blue socks and red shoes' if $t =~ /blue|white|red/: print $t And Python Regexes eventually lost (for me) some of their (what I believed) 'clunky appearance' ;-) Thanks Mirco -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
Hi John >>import re >> >>t = 'blue socks and red shoes' >>p = re.compile('(blue|white|red)') >>if p.match(t): > > What do you expect when t == "green socks and red shoes"? Is it possible > that you mean to use search() rather than match()? This is interesting. What's in this example the difference then between: import re t = 'blue socks and red shoes' if re.compile('blue|white|red').match(t): print t and t = 'blue socks and red shoes' if re.search('blue|white|red', t): print t > There is no need to compile the regex in advance in Python, either. > Please consider the module-level function search() ... > if re.search(r"blue|white|red", t): > # also, no need for () in the regex. Thats true. Thank you for pointing this out. But what would be an appropriate use of search() vs. match()? When to use what? I answered the posting in the first place because also I'm coming from a C/C++/Perl background and trying to get along in Python. Thanks, Mirco -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
Mirco Wahab wrote: > Lets see - a really simple find/match > would look like this in Python: > >import re > >t = 'blue socks and red shoes' >p = re.compile('(blue|white|red)') >if p.match(t): > print t > > which prints the text 't' because of > the positive pattern match. > > In Perl, you write: > >use Acme::Pythonic; > >$t = 'blue socks and red shoes' >if ($t =~ /(blue|white|red)/): > print $t > > which is one line shorter (no need > to compile the regular expression > in advance). > There is no need to compile the regular expression in advance in Python either: t = 'blue socks and red shoes' if re.match('(blue|white|red)', t): print t The only advantage to compiling in advance is a small speed up, and most of the time that won't be significant. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
On 8/05/2006 10:31 PM, Mirco Wahab wrote: [snip] > > Lets see - a really simple find/match > would look like this in Python: > >import re > >t = 'blue socks and red shoes' >p = re.compile('(blue|white|red)') >if p.match(t): What do you expect when t == "green socks and red shoes"? Is it possible that you mean to use search() rather than match()? > print t > > which prints the text 't' because of > the positive pattern match. > > In Perl, you write: > >use Acme::Pythonic; > >$t = 'blue socks and red shoes' >if ($t =~ /(blue|white|red)/): > print $t > > which is one line shorter (no need > to compile the regular expression > in advance). There is no need to compile the regex in advance in Python, either. Please consider the module-level function search() ... if re.search(r"blue|white|red", t): # also, no need for () in the regex. -- http://mail.python.org/mailman/listinfo/python-list
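The question about "green socks and red shoes" can be tried directly. This is a minimal sketch assuming Python 3, with variable names chosen for illustration:

```python
import re

pattern = re.compile(r'blue|white|red')

blue_first = 'blue socks and red shoes'
green_first = 'green socks and red shoes'

# match() only tries the pattern at the start of the string.
match_blue = pattern.match(blue_first)       # matches 'blue'
match_green = pattern.match(green_first)     # None: 'green' is not an alternative

# search() scans forward until the pattern matches somewhere.
search_green = pattern.search(green_first)   # matches 'red' further along

print(match_blue.group(0), match_green, search_green.group(0))
```

With the original string both calls succeed, which is why the difference is easy to miss until the input changes.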
Re: Python's regular expression?
Hi Davy > > More similar than Perl ;-) But C has { }'s everywhere, so has Perl ;-) > > And what's 'integrated' mean (must include some library)? Yes. In Python, regular expressions are just another function library - you use them like in Java or C. In Perl, it's part of the core language, you use the awk-style (eg: /.../) regular expressions everywhere you want. If you used regexp in C/C++ before, you can use them in almost the same way in Python - which may give you an easy start. BTW. Python has some fine extensions to the perl(5)-Regexes, e.g. 'named backreferences'. But you won't see much regular expressions in Python code posted to this group, maybe because it looks clunky - which is unpythonic ;-) Lets see - a really simple find/match would look like this in Python: import re t = 'blue socks and red shoes' p = re.compile('(blue|white|red)') if p.match(t): print t which prints the text 't' because of the positive pattern match. In Perl, you write: use Acme::Pythonic; $t = 'blue socks and red shoes' if ($t =~ /(blue|white|red)/): print $t which is one line shorter (no need to compile the regular expression in advance). > > I like C++ file I/O, is it 'low' or 'high'? C++ has afaik actually three levels of I/O: (1) - (from C, very low) operating system level, included by which provides direct access to operating system services (read(), write(), lseek() etc.) (2) - C-Standard-Library buffered IO, included by , provides structured 'mid-level' access like (block-) fread()/ fwrite(), line read (fgets()) and formatted I/O (fprintf()/ fscanf()) (3) - C++/streams library (high level, , , ), which abstracts out the i/o devices, provides the same set of functionality for any abstract input or output. Perl provides all three levels of I/O, the 'abstracting' is introduced by modules which tie 'handle variables' to anything that may receive or send data. 
Python also does a good job on all three levels, but provides the (low level) operating system I/O by external modules (afaik). I didn't do much I/O in Python, so I can't say much here. Regards Mirco -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
By the way, is there any tutorial talk about how to use the Python Shell (IDE). I wish it simple like VC++ :) Regards, Davy -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
Hi Mirco, Thank you! More similar than Perl ;-) And what's 'integrated' mean (must include some library)? I like C++ file I/O, is it 'low' or 'high'? Regards, Davy -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
Hi Davy wrote: > I am a C/C++/Perl user and want to switch to Python OK > (I found Python is more similar to C). ;-) More similar than what? > Does Python support robust regular expression like Perl? It supports them fairly good, but it's not 'integrated' - at least it feels not integrated for me ;-) If you did a lot of Perl, you know what 'integrated' means ... > And Python and Perl's File content manipulation, which is better? What is a 'file content manipulation'? Did you mean 'good xxx level file IO', where xxx means either 'low' or 'high'? > Any suggestions will be appreciated! Just try to start a small project in Python - from source that you already have in C or Perl or something. Regards Mirco -- http://mail.python.org/mailman/listinfo/python-list
Re: Python's regular expression?
"Davy" writes:
> Does Python support robust regular expressions like Perl?

Yep, Python's regular expressions are robust. Have a look at the Regex Howto: http://www.amk.ca/python/howto/regex/ and the re module: http://docs.python.org/lib/module-re.html

--
Lawrence - http://www.oluyede.org/blog
"Nothing is more dangerous than an idea if it's the only one you have" - E. A. Chartier
Python's regular expression?
Hi all,

I am a C/C++/Perl user and want to switch to Python (I found Python is more similar to C).

Does Python support robust regular expressions like Perl?

And Python and Perl's file content manipulation, which is better?

Any suggestions will be appreciated!

Best regards,
Davy
ANN: pyregex 0.5 - command line tools for Python's regular expression
pyregex is a command-line tool for constructing and testing Python regular expressions. Features include text highlighting, a detailed breakdown of match groups, substitution and a syntax quick reference. It is released in the public domain.

Screenshot and download: http://tungwaiyip.info/software/pyregex.html

Wai Yip Tung

Usage: pyregex.py [options] "-"|filename regex [replacement [count]]

Test Python regular expressions. Specify test data's filename or use "-" to enter test text from console. Optionally specify a replacement text.

Options:
  -f        filter mode
  -n nnn    limit to examine the first nnn lines. default no limit.
  -m        show only matched line. default False

Regular Expression Syntax

Special Characters

  .         matches any character except a newline
  ^         matches the start of the string
  $         matches the end of the string or just before the newline at
            the end of the string
  *         matches 0 or more repetitions of the preceding RE
  +         matches 1 or more repetitions of the preceding RE
  ?         matches 0 or 1 repetitions of the preceding RE
  {m}       exactly m copies of the previous RE should be matched
  {m,n}     matches from m to n repetitions of the preceding RE
  \         either escapes special characters or signals a special sequence
  []        indicates a set of characters. Characters can be listed
            individually, or a range of characters can be indicated by
            giving two characters and separating them by a "-". Special
            characters are not active inside sets. Including a "^" as the
            first character matches the complement of the set
  |         A|B matches either A or B
  (...)     indicates the start and end of a group
  (?...)    this is an extension notation. See documentation for detail
  (?iLmsux) I ignorecase; L locale; M multiline; S dotall; U unicode;
            X verbose

*, +, ? and {m,n} are greedy. Append the ? qualifier to match non-greedily.

Special Sequences

  \number   matches the contents of the group of the same number. Groups
            are numbered starting from 1
  \A        matches only at the start of the string
  \b        matches the empty string at the beginning or end of a word
  \B        matches the empty string not at the beginning or end of a word
  \d        matches any decimal digit
  \D        matches any non-digit character
  \g<name>  use the substring matched by the group named 'name' for sub()
  \s        matches any whitespace character
  \S        matches any non-whitespace character
  \w        matches any alphanumeric character and the underscore
  \W        matches any non-alphanumeric character
  \Z        matches only at the end of the string

See the Python documentation on Regular Expression Syntax for more detail: http://docs.python.org/lib/re-syntax.html
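A few entries of the quick reference are easier to grasp when run. The sketch below is not part of pyregex; it assumes Python 3 and the standard re module, and the sample strings are invented. It illustrates greedy versus non-greedy repetition, the \g<name> form in a sub() replacement, and the inline dotall flag (?s).

```python
import re

html = '<b>one</b> and <b>two</b>'

# Greedy: .* runs on to the last '</b>'; non-greedy .*? stops at the first.
greedy = re.search(r'<b>(.*)</b>', html).group(1)
lazy = re.search(r'<b>(.*?)</b>', html).group(1)

# \g<name> in a replacement string refers to a named group.
swapped = re.sub(r'(?P<first>\w+) (?P<second>\w+)',
                 r'\g<second> \g<first>',
                 'hello world')

# (?s) is the inline form of re.S: '.' then also matches a newline.
without_dotall = re.match(r'ab.sss', 'ab\nsss')
with_dotall = re.match(r'(?s)ab.sss', 'ab\nsss')

print(repr(greedy), repr(lazy), repr(swapped), without_dotall, with_dotall)
```

The dotall flag only changes what "." matches. It does not make a pattern ignore newlines that the pattern itself never mentions.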