On Oct 22, 3:26 pm, Jeremy <jlcon...@gmail.com> wrote:
> My question is, how can I use regular expressions to find two OR three
> or even an arbitrary number of floats without repeating %s?  Is this
> possible?
>
> Thanks,
> Jeremy

Any time you have tabular data such as your example, split() is
generally the first choice. But since you asked, and I like fscking
with regular expressions...

import re

# I modified your data set just a bit to show that it will
# match one real followed by zero or more space separated reals.
data = """
   1.0000E-08
   1.0000E-08   1.58024E-06 0.0048
   1.0000E-08   1.58024E-06 0.0048  1.0000E-07   2.98403E-05 0.0018
   foo bar baaz
   1.0000E-06   8.85470E-06 0.0026
   1.0000E-05   6.08120E-06 0.0032
   1.0000E-03   1.61817E-05 0.0022
   1.0000E+00   8.34460E-05 0.0014
   2.0000E+00   2.31616E-05 0.0017
   5.0000E+00   2.42717E-05 0.0017
   total        1.93417E-04 0.0012
"""

ntuple = re.compile(r"""
    # match beginning of line (re.M in the docs)
    ^
    # chew up anything before the first real (non-greedy -> ?)
    .*?
    # named match (turn the match into a named atom while
    # allowing irrelevant (groups))
    (?P<ntuple>
        # match one real
        [-+]?(\d*\.\d+|\d+\.\d*)([eE][-+]?\d+)?
        # followed by zero or more space separated reals
        ([ \t]+[-+]?(\d*\.\d+|\d+\.\d*)([eE][-+]?\d+)?)*)
    # match end of line (re.M in the docs)
    $
    """, re.X | re.M)  # re.X to allow comments and arbitrary whitespace

print([tuple(mo.group('ntuple').split())
       for mo in re.finditer(ntuple, data)])

Now compare the previous post using split with this one. Even with
the comments in the re, it's still a bit difficult to read. Regular
expressions are brittle. My code works fine for the data above, but if
you change the structure the re will probably fail. At that point, you
have to fiddle with the re to get it back on course.

Don't get me wrong, regular expressions are hella fun to play with.
But you have to ask yourself, "Do I really _need_ to use a regular
expression here?"
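
For comparison, here is a minimal sketch of the split() approach mentioned at the top. This is not the code from the earlier post; the function name floats_per_line and the sample string are made up for illustration. Note that float() accepts more than the regular expression does (plain integers, "nan", "inf"), so the two are not exactly equivalent.

```python
def floats_per_line(text):
    """Return a list of tuples, one per line that holds at least one float."""
    rows = []
    for line in text.splitlines():
        values = []
        for field in line.split():
            try:
                values.append(float(field))
            except ValueError:
                # Not a number (e.g. "total" or "foo"); skip this field.
                continue
        if values:
            rows.append(tuple(values))
    return rows


# Hypothetical sample, shaped like the data in this thread.
sample = """
   1.0000E-08   1.58024E-06 0.0048
   foo bar baaz
   total        1.93417E-04 0.0012
"""

for row in floats_per_line(sample):
    print(row)
```

Lines with no numeric fields are dropped, and a label such as "total" is simply skipped, so a change in column layout is less likely to break this than the regular expression.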