Re: [Tutor] regex and parsing through a semi-csv file

Mina Nozar Mon, 24 Oct 2011 15:41:48 -0700

Hi Marc,

Thank you. Following some of your suggestions, the rewrite below worked. I agree with your point on readability over complexity. By grace I just meant simpler, not convoluted. That's all. As a beginner, I find that not knowing all the existing functions, I end up re-inventing the wheel sometimes.



Cheers,
Mina
====

isotope_name,isotope_A = args.isotope.split('-')
print isotope_name, isotope_A

found_isotope = False
activity_time = []
activity = []
activity_err = []


f = open(args.fname, 'r')
lines = f.readlines()
f.close()

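# Find the header line for the requested isotope.
for i, line in enumerate(lines):
        line = line.strip()
        if isotope_name in line and isotope_A in line:
                found_isotope = True
                print 'found isotope'
                # keep only the lines that follow the header
                lines = lines[i+1:]
                break

# Read data lines only if the isotope was found; otherwise 'lines' still
# holds the whole file.
if found_isotope:
        for line in lines:
                line = line.strip()
                # a blank line or the next isotope's header ends the block
                if not line or not line[0].isdigit():
                        break
                print 'found'
                words = line.split(',')
                activity_time.append(float(words[0]))
                activity.append(float(words[1]))
                activity_err.append(float(words[2]))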
        
On 11-10-19 12:06 PM, Marc Tompkins wrote:

On Wed, Oct 5, 2011 at 11:12 AM, Mina Nozar <noz...@triumf.ca> wrote:

    Now, I would like to parse through this code and fill out 3 lists: 1)
    activity_time, 2) activity, 3) error, and plot the activities as a
    function of time using matplotlib.  My question specifically is on how
    to parse through the lines containing the data (activity time,
    activity, error) for a given isotope, stopping before reaching the next
    isotope's info.


Regular expressions certainly are terse, but (IMHO) they're really, really
hard to debug and maintain; I find I have to get myself into a Zen state to
even unpack them, and that just doesn't feel very Pythonic.

Here's an approach I've used in similar situations (a file with arbitrary
sequences of differently-formatted lines, where one line determines the
"type" of the lines that follow):
-  create a couple of status variables: currentElement, currentIsotope
-  read each line and split it into a list, separating on the commas
-  look at the first item on the line: is it an element?  (You could use a
   list of the element symbols, or you could just check to see if it's
   alphabetic...)
   -  if the first item is an element, then set currentElement and
      currentIsotope, move on to next line.
-  if the first item is NOT an element, then this is a data line.
   -  if currentElement and currentIsotope match what the user asked for,
      -  add time, activity, and error to the appropriate lists
   -  if not, move on.

This approach also works in the event that the data wasn't all collected in
order - i.e. there might be data for Ag111 followed by U235 followed by
Ag111 again.
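
A minimal sketch of the steps above, not a definitive implementation. The
file layout is an assumption: a header line such as "Ag-111" starts a block,
and each data line is "time,activity,error". The function name
parse_activities and the sample values are made up for illustration. The code
avoids print so it should run under Python 2 or 3.

```python
def parse_activities(lines, element, mass_number):
    """Collect time, activity and error values for one isotope.

    Assumed (hypothetical) layout: a header line such as "Ag-111" starts a
    block, followed by data lines of the form "time,activity,error".
    """
    wanted = (element, mass_number)
    current = None  # (element, mass number) of the block we are in
    times, activities, errors = [], [], []

    for raw in lines:
        fields = [f.strip() for f in raw.split(',')]
        first = fields[0]
        if not first:
            continue  # skip blank lines
        if first[0].isalpha():
            # Header line: remember which isotope the following data belongs to.
            name, _, mass = first.partition('-')
            current = (name, mass)
            continue
        # Data line: keep it only if it belongs to the requested isotope.
        if current == wanted:
            times.append(float(fields[0]))
            activities.append(float(fields[1]))
            errors.append(float(fields[2]))
    return times, activities, errors


# Made-up sample with the Ag-111 data split across two blocks.
sample = [
    "Ag-111",
    "0.0,10.0,0.5",
    "1.0,8.0,0.4",
    "U-235",
    "0.0,99.0,1.0",
    "Ag-111",
    "2.0,6.0,0.3",
]
times, activities, errors = parse_activities(sample, "Ag", "111")
```

Because the current isotope is tracked line by line, the second Ag-111 block
is picked up as well, which a "find the header, read until the next header"
loop would miss.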

    Note that the size of the lists will change depending on the number of
    activities for a given run of the simulation so I don't want to hard
    code '13' as the number of lines to read in following the line
    containing isotope_name, etc.


This should work for any number of lines or size of file, as long as the
data lines are all formatted as you expect.
Obviously a bit of error-trapping would be a good thing....
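
One possible shape for that error-trapping, as a sketch only. The helper name
parse_data_line is hypothetical, and returning None for a bad line is just
one choice; logging the line or raising with the line number would be equally
reasonable.

```python
def parse_data_line(line):
    """Return (time, activity, error) as floats, or None if the line is malformed."""
    fields = line.strip().split(',')
    if len(fields) < 3:
        return None  # too few columns to be a data line
    try:
        # float() tolerates surrounding whitespace, e.g. " 2.5"
        return tuple(float(f) for f in fields[:3])
    except ValueError:
        return None  # a field was not numeric
```

The caller can then skip (or report) any line for which the helper returns
None instead of letting float() stop the whole run.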

    If there is a more graceful way of doing this, please let me know as
    well.  I am new to python...

For me, readability and maintainability trump "grace" every time.  Nobody's
handing out awards for elegance (outside of the classroom), but complexity
gets punished (with bugs and wasted time.)  More elegant solutions might
also run faster, but remember that premature optimization is a Bad Thing.



_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

