Re: reusing parts of a string in RE matches?

2006-05-13 Thread Mirco Wahab
Hi Fredrik, you brought up some terse and somehow expressive lines with their own beauty ... [this] is best done by a list comprehension: l = [m[1] for m in re.findall(r, t)] or, [...] a generator expression: g = (m[1] for m in re.findall(r, t)) or process(m[1] for m in

Re: reusing parts of a string in RE matches?

2006-05-12 Thread Fredrik Lundh
Mirco Wahab wrote: In Python, you have to deconstruct the 2D-lists (here: long list of short lists [a,2] ...) by 'slicing the slice': char,num = list[:][:] in a loop and using the appropriate element then: import re t = 'a1a2a3Aa4a35a6b7b8c9c'; r = r'(\w)(?=(.)\1)' l

Re: reusing parts of a string in RE matches?

2006-05-11 Thread Mirco Wahab
Hi mpeters42 John With a more complex pattern (like 'a.a': match any character between two 'a' characters) this will get the length, but not what character is between the a's. Let's take this as a starting point for another example that comes to mind. You have a string of characters

Re: reusing parts of a string in RE matches?

2006-05-11 Thread John Salerno
Mirco Wahab wrote: Py: import re tx = 'a1a2a3A4a35a6b7b8c9c' rg = r'(\w)(?=(.\1))' print re.findall(rg, tx) The only problem seems to be (and I ran into this with my original example too) that what gets returned by this code isn't exactly what you are looking for, i.e. the numbers

Re: reusing parts of a string in RE matches?

2006-05-11 Thread John Salerno
Ben Cartwright wrote: Yes, and no extra for loops are needed! You can define groups inside the lookahead assertion: import re re.findall(r'(?=(aba))', 'abababababababab') ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba'] Wow, that was like magic! :)
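A minimal Python 3 sketch of the lookahead trick quoted above, using the same sample string as the thread. The variable names (`text`, `plain`, `overlapping`) are illustrative and not from the original posts. The point it shows: a plain `findall()` consumes each match, while a capturing group inside a zero-width lookahead consumes nothing, so the scan advances one character at a time.

```python
import re

text = 'abababababababab'

# A plain pattern consumes the characters it matches, so the next
# search starts after the previous 'aba' and overlapping hits are lost.
plain = re.findall(r'aba', text)

# The lookahead (?=...) is zero-width: it matches at a position without
# consuming anything, and the group inside it still captures 'aba'.
overlapping = re.findall(r'(?=(aba))', text)

print(len(plain), plain)              # 4 matches
print(len(overlapping), overlapping)  # 7 matches
```

This reproduces the "four versus seven" counts discussed in the thread.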

Re: reusing parts of a string in RE matches?

2006-05-11 Thread Mirco Wahab
Hi John rg = r'(\w)(?=(.)\1)' That would at least isolate the number, although you'd still have to get it out of the list/tuple. I have no idea how to do this in Python in a terse way - but I'll try ;-) In Perl, it's easy. Here, the match construct (\w)(?=(.)\1) returns all captures in a

Re: reusing parts of a string in RE matches?

2006-05-11 Thread John Salerno
Mirco Wahab wrote: I have no idea how to do this in Python in a terse way - but I'll try ;-) In Perl, it's easy. Here, the match construct (\w)(?=(.)\1) returns all captures in a list (a 1 a 2 a 4 b 7 c 9) Ah, I see the difference. In Python you get a list of tuples, so there seems to
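A short Python 3 sketch of the "list of tuples" behaviour described above. The pattern and the sample string are the ones that appear elsewhere in this thread; the names `pairs` and `middles` are made up for illustration. With two groups in the pattern, `re.findall()` returns one tuple per match, and tuple unpacking in a list comprehension pulls out the middle character.

```python
import re

# Sample string and pattern as posted elsewhere in the thread.
text = 'a1a2a3Aa4a35a6b7b8c9c'
pattern = r'(\w)(?=(.)\1)'

# With two groups, findall() yields (group1, group2) tuples.
pairs = re.findall(pattern, text)

# Unpack each tuple and keep only the character between the repeats.
middles = [between for _char, between in pairs]

print(pairs)    # [('a', '1'), ('a', '2'), ('a', '4'), ('b', '7'), ('c', '9')]
print(middles)  # ['1', '2', '4', '7', '9']
```

The match is case-sensitive, which is why the `a3A` stretch produces no pair.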

Re: reusing parts of a string in RE matches?

2006-05-11 Thread Mirco Wahab
Hi John Ah, I see the difference. In Python you get a list of tuples, so there seems to be a little extra work to do to get the number out. Dohh, after two cups of coffee and several bars of chocolate I eventually mad(e) it ;-) In Python, you have to deconstruct the 2D-lists (here: long

reusing parts of a string in RE matches?

2006-05-10 Thread John Salerno
I probably should find an RE group to post to, but my news server at work doesn't seem to have one, so I apologize. But this is in Python anyway :) So my question is, how can I find all occurrences of a pattern in a string, including overlapping matches? I figure it has something to do with

Re: reusing parts of a string in RE matches?

2006-05-10 Thread Murali
John Salerno wrote: So my question is, how can I find all occurrences of a pattern in a string, including overlapping matches? I figure it has something to do with look-ahead and look-behind, but I've only gotten this far: import re string = 'abababababababab' pattern =

Re: reusing parts of a string in RE matches?

2006-05-10 Thread Bo Yang
John Salerno wrote: I probably should find an RE group to post to, but my news server at work doesn't seem to have one, so I apologize. But this is in Python anyway :) So my question is, how can I find all occurrences of a pattern in a string, including overlapping matches? I figure it has

Re: reusing parts of a string in RE matches?

2006-05-10 Thread BartlebyScrivener
Right about now somebody usually jumps in and shows you how to do this without using regex and using string methods instead. I'll watch. rd -- http://mail.python.org/mailman/listinfo/python-list

Re: reusing parts of a string in RE matches?

2006-05-10 Thread John Salerno
Bo Yang wrote: This matches all the 'ab' followed by an 'a', but it doesn't include the 'a'. What I'd like to do is find all the 'aba' matches. A regular findall() gives four results, but really there are seven. I tried the code, but I get seven results! Sorry, I meant that findall()

Re: reusing parts of a string in RE matches?

2006-05-10 Thread John Salerno
BartlebyScrivener wrote: Right about now somebody usually jumps in and shows you how to do this without using regex and using string methods instead. I'll watch. rd Heh heh, I'm sure you're right, but this is more just an exercise for me in REs, so I'm curious how you might do it,

Re: reusing parts of a string in RE matches?

2006-05-10 Thread BartlebyScrivener
I have to at least try :) s = 'abababababababab' for x in range(len(s)): ... try: ... s.index('aba', x, x + 3) ... except ValueError: ... pass rd

Re: reusing parts of a string in RE matches?

2006-05-10 Thread John Salerno
BartlebyScrivener wrote: I have to at least try :) s = 'abababababababab' for x in range(len(s)): ... try: ... s.index('aba', x, x + 3) ... except ValueError: ... pass rd yeah, looks like index() or find() can be used to do it instead of RE, but still, i'd
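For a fixed substring, the string-method route mentioned above can be sketched without try/except by letting `str.find()` restart one character past each hit. This is a Python 3 illustration only; `find_overlapping` is a hypothetical helper name, not something posted in the thread.

```python
def find_overlapping(haystack, needle):
    """Return the start index of every occurrence of needle, overlaps included."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Resume one character after the last hit, not after its end,
        # so occurrences that overlap the previous one are still found.
        start = haystack.find(needle, start + 1)
    return positions


print(find_overlapping('abababababababab', 'aba'))  # [0, 2, 4, 6, 8, 10, 12]
```

It only handles literal substrings; anything with wildcards still needs a regex.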

Re: reusing parts of a string in RE matches?

2006-05-10 Thread mpeters42
From the Python 2.4 docs: findall( pattern, string[, flags]) Return a list of all ***non-overlapping*** matches of pattern in string By design, the regex functions return non-overlapping patterns. Without doing some kind of looping, I think you are out of luck. If your pattern is fixed,
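The "some kind of looping" mentioned above could look roughly like this in Python 3: search, record the match, then restart the search one character after the start of that match. `overlapping_matches` is a hypothetical name, and the short test strings are chosen for illustration. It relies on the optional `pos` argument of a compiled pattern's `search()` method.

```python
import re


def overlapping_matches(pattern, text):
    """Collect matches of pattern in text, allowing them to overlap."""
    regex = re.compile(pattern)
    results = []
    pos = 0
    while True:
        match = regex.search(text, pos)
        if match is None:
            break
        results.append(match.group())
        # Step forward by one character from the match start rather than
        # jumping to the match end, so overlapping matches are not skipped.
        pos = match.start() + 1
    return results


print(overlapping_matches(r'aba', 'abababababababab'))  # seven 'aba' strings
print(overlapping_matches(r'a.a', 'a1a2a3'))            # ['a1a', 'a2a']
```

Unlike the fixed-string approach, this also works with wildcard patterns such as `a.a` and returns the full matched text.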

Re: reusing parts of a string in RE matches?

2006-05-10 Thread BartlebyScrivener
otherwise i might as well just use string methods I think you're supposed to use string methods if you can, to avoid the old adage about having two problems instead of one when using regex. rd

Re: reusing parts of a string in RE matches?

2006-05-10 Thread John Salerno
[EMAIL PROTECTED] wrote: string = 'abababababababab' pat = 'aba' [pat for s in re.compile('(?='+pat+')').findall(string)] ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba'] Wow, I have no idea how to read that RE. First off, what does it match? Should something come before the parentheses,

Re: reusing parts of a string in RE matches?

2006-05-10 Thread John Salerno
John Salerno wrote: [EMAIL PROTECTED] wrote: string = 'abababababababab' pat = 'aba' [pat for s in re.compile('(?='+pat+')').findall(string)] ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba'] Wow, I have no idea how to read that RE. First off, what does it match? Should something come

Re: reusing parts of a string in RE matches?

2006-05-10 Thread Kent Johnson
John Salerno wrote: I probably should find an RE group to post to, but my news server at work doesn't seem to have one, so I apologize. But this is in Python anyway :) So my question is, how can I find all occurrences of a pattern in a string, including overlapping matches? You can

Re: reusing parts of a string in RE matches?

2006-05-10 Thread mpeters42
Exactly. Now this will work as long as there are no wildcards in the pattern. Thus, only with fixed strings. But if you have a fixed string, there is really no need to use regex, as it will complicate your life for no real reason (as opposed to simple string methods). With a more complex pattern

Re: reusing parts of a string in RE matches?

2006-05-10 Thread Ben Cartwright
John Salerno wrote: So my question is, how can I find all occurrences of a pattern in a string, including overlapping matches? I figure it has something to do with look-ahead and look-behind, but I've only gotten this far: import re string = 'abababababababab' pattern = re.compile(r'ab(?=a)')

Re: reusing parts of a string in RE matches?

2006-05-10 Thread Murali
Yes, and no extra for loops are needed! You can define groups inside the lookahead assertion: import re re.findall(r'(?=(aba))', 'abababababababab') ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba'] Wonderful and this works with any regexp, so import re def

Re: reusing parts of a string in RE matches?

2006-05-10 Thread Ben Cartwright
Murali wrote: Yes, and no extra for loops are needed! You can define groups inside the lookahead assertion: import re re.findall(r'(?=(aba))', 'abababababababab') ['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba'] Wonderful and this works with any regexp, so import re def

Re: reusing parts of a string in RE matches?

2006-05-10 Thread BartlebyScrivener
Thanks, Ben. Quite an education! rick