Re: Please help with regular expression finding multiple floats
On Oct 24, 12:00 am, Edward Dolan byteco...@gmail.com wrote:
> No, you're not missing a thing. I am ;) Something was happening with
> the triple-quoted strings when I pasted them. Here is, hopefully, the
> correct code.
>
> http://codepad.org/OIazr9lA
>
> The output is shown on that page as well. Sorry for the line noise,
> folks. One of these days I'm going to learn gnus.

Yep, now that works. Thanks for the help.

Jeremy
--
http://mail.python.org/mailman/listinfo/python-list
Re: Please help with regular expression finding multiple floats
No, you're not missing a thing. I am ;) Something was happening with
the triple-quoted strings when I pasted them. Here is, hopefully, the
correct code.

http://codepad.org/OIazr9lA

The output is shown on that page as well. Sorry for the line noise,
folks. One of these days I'm going to learn gnus.
Re: Please help with regular expression finding multiple floats
On Oct 22, 3:26 pm, Jeremy jlcon...@gmail.com wrote:
> My question is, how can I use regular expressions to find two OR three
> or even an arbitrary number of floats without repeating %s? Is this
> possible?
>
> Thanks,
> Jeremy

Any time you have tabular data such as your example, split() is
generally the first choice. But since you asked, and I like fscking
with regular expressions...

import re

# I modified your data set just a bit to show that it will
# match zero or more space separated real numbers.
data = """
1.E-08
1.E-08 1.58024E-06 0.0048 1.E-08 1.58024E-06 0.0048
1.E-07 2.98403E-05 0.0018
foo bar baaz 1.E-06 8.85470E-06 0.0026
1.E-05 6.08120E-06 0.0032
1.E-03 1.61817E-05 0.0022
1.E+00 8.34460E-05 0.0014
2.E+00 2.31616E-05 0.0017
5.E+00 2.42717E-05 0.0017
total 1.93417E-04 0.0012
"""

ntuple = re.compile(r"""
    # match beginning of line (re.M in the docs)
    ^
    # chew up anything before the first real (non-greedy - ?)
    .*?
    # named match (turn the match into a named atom while allowing
    # irrelevant (groups))
    (?P<ntuple>
      # match one real
      [-+]?(\d*\.\d+|\d+\.\d*)([eE][-+]?\d+)?
      # followed by zero or more space separated reals
      ([ \t]+[-+]?(\d*\.\d+|\d+\.\d*)([eE][-+]?\d+)?)*)
    # match end of line (re.M in the docs)
    $
    """, re.X | re.M)  # re.X to allow comments and arbitrary whitespace

print([tuple(mo.group('ntuple').split())
       for mo in re.finditer(ntuple, data)])

Now compare the previous post using split with this one. Even with the
comments in the re, it's still a bit difficult to read. Regular
expressions are brittle. My code works fine for the data above, but if
you change the structure the re will probably fail. At that point, you
have to fiddle with the re to get it back on course.

Don't get me wrong, regular expressions are hella fun to play with.
You have to ask yourself, "Do I really _need_ to use a regular
expression here?"
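The pattern in this message wraps the whole run of reals in one named group and then calls split() on it. A minimal sketch of why that helps, assuming only the standard re module: a capturing group placed inside a repetition keeps just its final match, so one group per number cannot hand back every float on the line. The sample string and the group name 'run' are made up for illustration and are not from the thread.

```python
import re

# Hypothetical sample input; the group name 'run' is illustrative only.
m = re.match(r'(?P<run>(\d+)(?:\s+(\d+))*)', '10 20 30')

first = m.group(2)               # the leading number
last = m.group(3)                # a repeated group keeps only its last match
fields = m.group('run').split()  # the whole run, split after matching

print(first)
print(last)
print(fields)
```

Here `last` holds only the final repetition, while `fields` recovers all three numbers, which is the same trick the named `ntuple` group relies on.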
Re: Please help with regular expression finding multiple floats
I can see why this line could wrap:

1.E-08 1.58024E-06 0.0048 1.E-08 1.58024E-06 0.0048

But this one?

1.E-07 2.98403E-05 0.0018

Anyway, here is the code: http://codepad.org/Z7eWBusl
Re: Please help with regular expression finding multiple floats
On Oct 23, 3:48 am, Edward Dolan byteco...@gmail.com wrote:
> On Oct 22, 3:26 pm, Jeremy jlcon...@gmail.com wrote:
> > My question is, how can I use regular expressions to find two OR
> > three or even an arbitrary number of floats without repeating %s?
> > Is this possible?
>
> Any time you have tabular data such as your example, split() is
> generally the first choice. But since you asked, and I like fscking
> with regular expressions...
>
> [snip code]
>
> Regular expressions are brittle. My code works fine for the data
> above, but if you change the structure the re will probably fail.
>
> [snip]
>
> You have to ask yourself, "Do I really _need_ to use a regular
> expression here?"

In this simplified example I don't really need regular expressions.
However, I will need regular expressions for more complex problems, and
I'm trying to become more proficient at using them. I tried to simplify
this so as not to bother the mailing list too much.

Thanks for the great suggestion. It looks like it will work fine, but I
can't get it to work. I downloaded the simple script you put on
http://codepad.org/Z7eWBusl but it only prints an empty list. Am I
missing something?

Thanks,
Jeremy
Please help with regular expression finding multiple floats
I have text that looks like the following (but all in one string with
'\n' separating the lines):

1.E-08 1.58024E-06 0.0048
1.E-07 2.98403E-05 0.0018
1.E-06 8.85470E-06 0.0026
1.E-05 6.08120E-06 0.0032
1.E-03 1.61817E-05 0.0022
1.E+00 8.34460E-05 0.0014
2.E+00 2.31616E-05 0.0017
5.E+00 2.42717E-05 0.0017
total 1.93417E-04 0.0012

I want to capture the two or three floating point numbers in each line
and store them in a tuple. I want to find all such tuples such that I
have

[('1.E-08', '1.58024E-06', '0.0048'),
 ('1.E-07', '2.98403E-05', '0.0018'),
 ('1.E-06', '8.85470E-06', '0.0026'),
 ('1.E-05', '6.08120E-06', '0.0032'),
 ('1.E-03', '1.61817E-05', '0.0022'),
 ('1.E+00', '8.34460E-05', '0.0014'),
 ('2.E+00', '2.31616E-05', '0.0017'),
 ('5.E+00', '2.42717E-05', '0.0017'),
 ('1.93417E-04', '0.0012')]

as a result. I have the regular expression pattern

fp1 = '([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s+'

which can find a floating point number followed by some space. I can
find three floats with

found = re.findall('%s%s%s' % (fp1, fp1, fp1), text)

My question is, how can I use regular expressions to find two OR three
or even an arbitrary number of floats without repeating %s? Is this
possible?

Thanks,
Jeremy
Re: Please help with regular expression finding multiple floats
On Thu, 22 Oct 2009 23:26:01 +0100, Jeremy jlcon...@gmail.com wrote:
> I have text that looks like the following (but all in one string with
> '\n' separating the lines):
>
> 1.E-08 1.58024E-06 0.0048
> [snip]
> 5.E+00 2.42717E-05 0.0017
> total 1.93417E-04 0.0012
>
> I want to capture the two or three floating point numbers in each line
> and store them in a tuple. I want to find all such tuples such that I
> have
>
> [('1.E-08', '1.58024E-06', '0.0048'),
> [snip]
>  ('5.E+00', '2.42717E-05', '0.0017'),
>  ('1.93417E-04', '0.0012')]
>
> as a result. I have the regular expression pattern
>
> fp1 = '([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s+'
>
> which can find a floating point number followed by some space.

Hmm. Is .01 really valid? Oh well, let's assume so. I'd seriously
recommend using a raw string r'' to define fp1 with, though; it's a
good habit to get into with regular expressions, and when (not if) the
fact that none of your backslashes are escaped matters, you won't waste
hours wondering what just bit you.

> I can find three floats with
>
> found = re.findall('%s%s%s' % (fp1, fp1, fp1), text)
>
> My question is, how can I use regular expressions to find two OR three
> or even an arbitrary number of floats without repeating %s? Is this
> possible?

Yes. On the off-chance that this is homework, I'll just observe that
the only difference between detecting repeated digits (say) and
repeated float-expressions is exactly what you apply the repetition
operators to. The documentation for the 're' module at python.org is
your friend!

--
Rhodri James *-* Wildebeest Herder to the Masses
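One way to read the hint about repetition operators, sketched under stated assumptions rather than as the intended answer: write the float sub-expression once as a raw string, wrap "whitespace + float" in a non-capturing group, and apply `*` to that group. The names FLOAT and ROW and the float pattern itself are assumptions made for this sketch (it is not the poster's fp1), and the sample text is a shortened copy of the data in the question.

```python
import re

# Assumed float pattern (not the fp1 from the question): digits with an
# optional fraction, or a leading-dot fraction, plus an optional exponent.
FLOAT = r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?'

# One float, then zero or more repetitions of "blanks + float".
# [ \t] rather than \s keeps a match from running across a newline.
ROW = re.compile(r'%s(?:[ \t]+%s)*' % (FLOAT, FLOAT))

text = ('1.E-08 1.58024E-06 0.0048\n'
        '1.E-07 2.98403E-05 0.0018\n'
        'total 1.93417E-04 0.0012')

# Each match is one run of floats; split it to get the individual strings.
rows = [tuple(m.group(0).split()) for m in ROW.finditer(text)]
print(rows)
```

The repetition is applied to the whole float expression instead of to a digit class, so the same pattern covers two, three, or any number of floats per line.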
Re: Please help with regular expression finding multiple floats
> I have text that looks like the following (but all in one string with
> '\n' separating the lines):
>
> I want to capture the two or three floating point numbers in each line
> and store them in a tuple.
>
> I have the regular expression pattern

Jeremy

For a non-regular-expression solution you might consider something
similar to the following:

s = '''\
1.E-08 1.58024E-06 0.0048
1.E-07 2.98403E-05 0.0018
1.E-06 8.85470E-06 0.0026
1.E-05 6.08120E-06 0.0032
1.E-03 1.61817E-05 0.0022
1.E+00 8.34460E-05 0.0014
2.E+00 2.31616E-05 0.0017
5.E+00 2.42717E-05 0.0017
total 1.93417E-04 0.0012'''

l1 = s.split('\n')
l2 = []

# every row but the last holds only numbers
for this_row in l1[:-1]:
    temp = this_row.strip().split()
    l2.append([float(x) for x in temp])

# the last row starts with the label 'total', so skip its first field
last = l1[-1].strip().split()[1:]
l2.append([float(x) for x in last])

print('')
for this_row in l2:
    if len(this_row) > 2:
        x, y, z = this_row
        print('%5.4e %5.4e %5.4e ' % (x, y, z))
    else:
        x, y = this_row
        print('%5.4e %5.4e ' % (x, y))

--
Stanley C. Kitching
Human Being
Phoenix, Arizona
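The split-based approach can be made independent of where the label row sits. The following is a minimal sketch, not the poster's code: it tries float() on every whitespace-separated token and drops the ones that fail. The function name parse_rows and the two-line sample are assumptions for illustration. One caveat: float() also accepts tokens such as 'nan' and 'inf', so a label spelled that way would be kept as a number.

```python
def parse_rows(text):
    """Return one tuple of floats per line, ignoring non-numeric tokens."""
    rows = []
    for line in text.splitlines():
        values = []
        for token in line.split():
            try:
                values.append(float(token))
            except ValueError:
                pass  # not a number, e.g. the 'total' label
        if values:  # skip blank or label-only lines
            rows.append(tuple(values))
    return rows


sample = '1.E-08 1.58024E-06 0.0048\ntotal 1.93417E-04 0.0012'
result = parse_rows(sample)
print(result)
```

Letting float() decide what counts as a number avoids maintaining a float regex at all, at the cost of returning floats rather than the original strings.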