Re: Please help with regular expression finding multiple floats

2009-10-26 Thread Jeremy
On Oct 24, 12:00 am, Edward Dolan byteco...@gmail.com wrote:
 No, you're not missing a thing. I am ;) Something was happening with
 the triple-quoted
 strings when I pasted them. Here is, hopefully, the correct code.
 http://codepad.org/OIazr9lA
 The output is shown on that page as well.

 Sorry for the line noise folks. One of these days I'm going to learn
 gnus.

Yep, now that works.  Thanks for the help.
Jeremy
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Please help with regular expression finding multiple floats

2009-10-24 Thread Edward Dolan
No, you're not missing a thing. I am ;) Something was happening with
the triple-quoted strings when I pasted them. Here is, hopefully, the
correct code.
http://codepad.org/OIazr9lA
The output is shown on that page as well.

Sorry for the line noise folks. One of these days I'm going to learn
gnus.


Re: Please help with regular expression finding multiple floats

2009-10-23 Thread Edward Dolan
On Oct 22, 3:26 pm, Jeremy jlcon...@gmail.com wrote:
 My question is, how can I use regular expressions to find two OR three
 or even an arbitrary number of floats without repeating %s?  Is this
 possible?

 Thanks,
 Jeremy

Any time you have tabular data such as your example, split() is
generally the first choice. But since you asked, and I like fscking
with regular expressions...

import re

# I modified your data set just a bit to show that it will
# match zero or more space separated real numbers.

data = """
1.E-08

1.E-08 1.58024E-06 0.0048 1.E-08 1.58024E-06 0.0048
1.E-07 2.98403E-05 0.0018
foo bar baaz
1.E-06 8.85470E-06 0.0026
1.E-05 6.08120E-06 0.0032
1.E-03 1.61817E-05 0.0022
1.E+00 8.34460E-05 0.0014
2.E+00 2.31616E-05 0.0017
5.E+00 2.42717E-05 0.0017
total 1.93417E-04 0.0012
"""

ntuple = re.compile(r"""
# match beginning of line (re.M in the docs)
^
# chew up anything before the first real (non-greedy -> ?)
.*?
# named match (turn the match into a named atom while allowing
# irrelevant (groups))
(?P<ntuple>
  # match one real
  [-+]?(\d*\.\d+|\d+\.\d*)([eE][-+]?\d+)?
  # followed by zero or more space separated reals
  ([ \t]+[-+]?(\d*\.\d+|\d+\.\d*)([eE][-+]?\d+)?)*)
# match end of line (re.M in the docs)
$
""", re.X | re.M)  # re.X to allow comments and arbitrary whitespace

print([tuple(mo.group('ntuple').split())
       for mo in re.finditer(ntuple, data)])

Now compare the previous post using split with this one. Even with the
comments in the re, it's still a bit difficult to read. Regular
expressions are brittle. My code works fine for the data above, but if
you change the structure the re will probably fail. At that point, you
have to fiddle with the re to get it back on course.

Don't get me wrong, regular expressions are hella fun to play with.
You have to ask yourself, "Do I really _need_ to use a regular
expression here?"
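
For comparison, here is a minimal sketch of the split() route mentioned at the top of this message. It is not code from the thread: the helper name float_rows and the sample string are made up for illustration, and it assumes that "float" means "whatever Python's float() accepts", which also includes tokens such as 'nan' and 'inf'.

```python
def float_rows(text):
    """Return one tuple of float-like tokens per line, skipping lines with none."""
    rows = []
    for line in text.splitlines():
        tokens = []
        for token in line.split():
            try:
                float(token)      # validate only; the original text is kept
            except ValueError:
                continue          # skip labels such as "total"
            tokens.append(token)
        if tokens:
            rows.append(tuple(tokens))
    return rows


sample = ("1.E-08   1.58024E-06 0.0048\n"
          "  total  1.93417E-04 0.0012\n"
          "foo bar baaz\n")
print(float_rows(sample))
# [('1.E-08', '1.58024E-06', '0.0048'), ('1.93417E-04', '0.0012')]
```

Because each token is checked on its own, the number of floats per line does not matter, and there is no pattern to re-tune if the column layout changes.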


Re: Please help with regular expression finding multiple floats

2009-10-23 Thread Edward Dolan
The code in my previous post got mangled by line wrapping. I can see
why this line could wrap
 1.E-08 1.58024E-06 0.0048 1.E-08 1.58024E-06
 0.0048
But this one?
 1.E-07 2.98403E-05
 0.0018

Anyway, here is the code: http://codepad.org/Z7eWBusl


Re: Please help with regular expression finding multiple floats

2009-10-23 Thread Jeremy
On Oct 23, 3:48 am, Edward Dolan byteco...@gmail.com wrote:
 On Oct 22, 3:26 pm, Jeremy jlcon...@gmail.com wrote:

  My question is, how can I use regular expressions to find two OR three
  or even an arbitrary number of floats without repeating %s?  Is this
  possible?

  Thanks,
  Jeremy

 Any time you have tabular data such as your example, split() is
 generally the first choice. But since you asked, and I like fscking
 with regular expressions...

 [snip code]

 Now compare the previous post using split with this one. Even with the
 comments in the re, it's still a bit difficult to read. Regular
 expressions are brittle. My code works fine for the data above, but if
 you change the structure the re will probably fail. At that point, you
 have to fiddle with the re to get it back on course.

 Don't get me wrong, regular expressions are hella fun to play with.
 You have to ask yourself, "Do I really _need_ to use a regular
 expression here?"

In this simplified example I don't really need regular expressions.
However I will need regular expressions for more complex problems and
I'm trying to become more proficient at using regular expressions.  I
tried to simplify this so as not to bother the mailing list too much.

Thanks for the great suggestion.  It looks like it will work fine, but
I can't get it to work.  I downloaded the simple script you put on
http://codepad.org/Z7eWBusl  but it only prints an empty list.  Am I
missing something?

Thanks,
Jeremy


Please help with regular expression finding multiple floats

2009-10-22 Thread Jeremy
I have text that looks like the following (but all in one string with
'\n' separating the lines):

1.E-08   1.58024E-06 0.0048
1.E-07   2.98403E-05 0.0018
1.E-06   8.85470E-06 0.0026
1.E-05   6.08120E-06 0.0032
1.E-03   1.61817E-05 0.0022
1.E+00   8.34460E-05 0.0014
2.E+00   2.31616E-05 0.0017
5.E+00   2.42717E-05 0.0017
  total  1.93417E-04 0.0012

I want to capture the two or three floating point numbers in each line
and store them in a tuple.  I want to find all such tuples such that I
have
[('1.E-08', '1.58024E-06', '0.0048'),
 ('1.E-07', '2.98403E-05', '0.0018'),
 ('1.E-06', '8.85470E-06', '0.0026'),
 ('1.E-05', '6.08120E-06', '0.0032'),
 ('1.E-03', '1.61817E-05', '0.0022'),
 ('1.E+00', '8.34460E-05', '0.0014'),
 ('2.E+00', '2.31616E-05', '0.0017'),
 ('5.E+00', '2.42717E-05', '0.0017'),
 ('1.93417E-04', '0.0012')]

as a result.  I have the regular expression pattern

fp1 = '([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s+'

which can find a floating point number followed by some space.  I can
find three floats with

found = re.findall('%s%s%s' % (fp1, fp1, fp1), text)

My question is, how can I use regular expressions to find two OR three
or even an arbitrary number of floats without repeating %s?  Is this
possible?

Thanks,
Jeremy






Re: Please help with regular expression finding multiple floats

2009-10-22 Thread Rhodri James

On Thu, 22 Oct 2009 23:26:01 +0100, Jeremy jlcon...@gmail.com wrote:


I have text that looks like the following (but all in one string with
'\n' separating the lines):

1.E-08   1.58024E-06 0.0048

[snip]

5.E+00   2.42717E-05 0.0017
  total  1.93417E-04 0.0012

I want to capture the two or three floating point numbers in each line
and store them in a tuple.  I want to find all such tuples such that I
have
[('1.E-08', '1.58024E-06', '0.0048'),

[snip]

 ('5.E+00', '2.42717E-05', '0.0017'),
 ('1.93417E-04', '0.0012')]

as a result.  I have the regular expression pattern

fp1 = '([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s+'

which can find a floating point number followed by some space.


Hmm.  Is .01 really valid?  Oh well, let's assume so.  I'd
seriously recommend using a raw string (r'...') to define fp1,
though; it's a good habit to get into with regular expressions,
and when (not if) the fact that none of your backslashes are
escaped matters, you won't waste hours wondering what just bit you.
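
A small illustration of the raw-string point, using \b as the example because it means something different to Python's string parser (backspace) than to the re module (word boundary). The variable names and the test string are chosen only for this sketch.

```python
import re

plain = '\bfoo\b'    # Python turns each \b into a backspace character
raw = r'\bfoo\b'     # backslashes survive, so re sees \b (word boundary)

print(len(plain), len(raw))                 # 5 7
print(re.search(plain, 'a foo b'))          # None
print(re.search(raw, 'a foo b').group())    # foo
```

The escapes in fp1 (\d, \. and \s) are not recognised by the string parser, so they happen to pass through unchanged, which is why the non-raw version works at all; recent Python versions warn about such escapes.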


 I can
find three floats with

found = re.findall('%s%s%s' % (fp1, fp1, fp1), text)

My question is, how can I use regular expressions to find two OR three
or even an arbitrary number of floats without repeating %s?  Is this
possible?


Yes.  On the off-chance that this is homework, I'll just observe that
the only difference between detecting repeated digits (say) and repeated
float-expressions is exactly what you apply the repetition operators to.
The documentation for the 're' module at python.org is your friend!
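
One possible reading of that hint, as a hedged sketch rather than the intended answer: wrap "whitespace plus one float" in a non-capturing group and apply * to the whole group. The names FLOAT, ROW and find_rows are invented here. The number sub-pattern is borrowed from Edward Dolan's reply in this thread, because fp1 as posted requires a digit after the optional dot and so cannot match '1.E-08' as a whole.

```python
import re

# One float: digits on at least one side of a mandatory dot, optional exponent.
FLOAT = r'[-+]?(?:\d*\.\d+|\d+\.\d*)(?:[eE][-+]?\d+)?'

# The repetition operator * applies to the whole "whitespace + float" group.
ROW = re.compile(r'(?:%s)(?:[ \t]+%s)*' % (FLOAT, FLOAT))


def find_rows(text):
    """Return a tuple of float strings for each run of floats on a line."""
    return [tuple(m.group().split()) for m in ROW.finditer(text)]


text = "1.E-08   1.58024E-06 0.0048\n  total  1.93417E-04 0.0012\n"
print(find_rows(text))
# [('1.E-08', '1.58024E-06', '0.0048'), ('1.93417E-04', '0.0012')]
```

The whole match is split afterwards because a repeated capturing group only remembers its last repetition, so a single findall() call cannot hand back a variable number of floats as separate groups. The sketch is not anchored to line or word boundaries, so a float embedded in a longer token would also be picked up.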

--
Rhodri James *-* Wildebeest Herder to the Masses


Re: Please help with regular expression finding multiple floats

2009-10-22 Thread Cousin Stanley

 I have text that looks like the following 
 (but all in one string with '\n' separating the lines):
 

 I want to capture the two or three floating point numbers in each line
 and store them in a tuple.  
 
 I have the regular expression pattern
 

Jeremy  

  For a non-regular-expression solution
  you might consider something similar to
  the following:

s = '''\
1.E-08   1.58024E-06 0.0048
1.E-07   2.98403E-05 0.0018
1.E-06   8.85470E-06 0.0026
1.E-05   6.08120E-06 0.0032
1.E-03   1.61817E-05 0.0022
1.E+00   8.34460E-05 0.0014
2.E+00   2.31616E-05 0.0017
5.E+00   2.42717E-05 0.0017
  total  1.93417E-04 0.0012'''

l1 = s.split( '\n' )

l2 = [ ]

# every row except the last holds nothing but floats
for this_row in l1[ : -1 ] :
    temp = this_row.strip().split()
    l2.append( [ float( x ) for x in temp ] )

# the last row starts with the label 'total', so drop its first field
last = l1[ -1 ].strip().split()[ 1 : ]

l2.append( [ float( x ) for x in last ] )

print( '' )
for this_row in l2 :
    if len( this_row ) > 2 :
        x , y , z = this_row
        print( '%5.4e  %5.4e  %5.4e ' % ( x , y , z ) )
    else :
        x , y = this_row
        print( '%5.4e  %5.4e ' % ( x , y ) )



-- 
Stanley C. Kitching
Human Being
Phoenix, Arizona
