[Tutor] How to match strange characters

J. Van Brimmer Sun, 07 Sep 2008 23:47:58 -0700

I have a legacy program at work that outputs a text file with this header:

ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»

º Radio Source Precession Program º
º by John B. Doe º
º 31 August 1992 º
ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍŒ

Enter Date for Precession as (MM-DD-YYYY) or C/R for 05-28-2004 > 05-28-2004

Enter the Catalog Name or C/R for CATALOG.SRC >
The Julian Date is = 2453153.5
0022+002 5.6564 +0.2713 00:22:37.54 00:16:16.65
0106+013 17.2117 +1.6052 01:08:50.80 01:36:18.58
.
.
.
much more regular integer data lines to the end of the section.

One section is created each time the program is run, and each section has one of these headers. Each section is appended to the end of the file, so each new header follows the last data line of the previous section.

I am trying to write a Python script to strip these headers (the first five lines of each section) from the file. The name of this legacy program is PRECESS. Every time we run PRECESS, the header is repeated, not just at the top of the file.


Here's my code so far:

(code)
import re

def main():

    f = open('/home/jerry/sepoct08.txt', 'r') # sepoct08.txt is the PRECESS output file

    for line in f:
        if re.search('ÉÍÍÍ', line):
            print line
        elif re.search('> ..-..-....', line): # this line prints out
            print line
        elif re.search('Catalog', line): # this line prints out
            print line
        elif re.search('Julian', line): # this line prints out
            print line
        print "Hi there!" # I print out this just so I know my script is looping

    f.close()

if __name__ == "__main__":
    main()
(/code)

Here's the output from my code:

(output)
Hi there!
Hi there!
Hi there!
Hi there!
Hi there!
Enter Date for Precession as (MM-DD-YYYY) or C/R for 05-28-2004 > 05-28-2004

Hi there!
Enter the Catalog Name or C/R for CATALOG.SRC >

Hi there!
The Julian Date is = 2453153.5

Hi there!
Hi there!
.
.
.
. end of file
(/output)

As you can see, I can print out the three lines after the strange header lines, but not the strange character lines. How can I match on those strange characters? What are they?
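One way to find out what the characters are is to look at the raw bytes instead of at how a terminal or mail client renders them. The sketch below rests on a guess that is not confirmed anywhere in this post: that PRECESS writes its box in the old DOS code page 437, where bytes 0xC9, 0xCD and 0xBB are the box-drawing characters ╔, ═ and ╗ (the same bytes show up as É, Í and » when read as Latin-1). The helper name describe_line is made up for illustration.

```python
def describe_line(raw_line):
    """Return (hex dump of the bytes, the bytes decoded as CP437).

    CP437 is only an assumption about what PRECESS writes.
    """
    # bytearray yields integers on both Python 2 and Python 3
    hex_bytes = ' '.join('%02x' % b for b in bytearray(raw_line))
    return hex_bytes, raw_line.decode('cp437')

# Possible use on the real file (opened in binary mode so nothing is decoded):
#     with open('/home/jerry/sepoct08.txt', 'rb') as f:
#         first_line = f.readline()
#     print(describe_line(first_line))
```

If the hex dump of the first header line starts with c9 followed by a run of cd, the code page 437 guess is plausible; if it shows something else, the encoding assumption needs to be revisited.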

I'm just trying to figure out how to print out each line from the header first; later I will modify the code to process those lines as needed. My problem is those strange characters in the top part of the header. The re module doesn't recognize them. How can I match on them, so I can delete those lines? I can't do it by line number because they aren't recognized.
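A possible way to match and drop the box lines is to work on bytes, so that the pattern does not depend on how the script's own source file or the terminal encodes "É". This is a minimal sketch, not a tested solution for the real file: it assumes the box lines start with one of the bytes 0xC9, 0xC8 or 0xBA (the top-left corner, bottom-left corner and vertical bar in code page 437), and the names BOX_LINE, strip_box_lines and strip_file are hypothetical.

```python
import re

# Assumed box bytes: 0xC9 (top-left), 0xC8 (bottom-left), 0xBA (vertical bar).
# The pattern is a bytes pattern, so it must be matched against bytes.
BOX_LINE = re.compile(br'^\s*[\xc9\xc8\xba]')

def strip_box_lines(raw_lines):
    """Return the lines that do not start with an assumed box byte."""
    return [line for line in raw_lines if not BOX_LINE.match(line)]

def strip_file(in_path, out_path):
    """Copy in_path to out_path, leaving out the boxed header lines."""
    # Binary mode keeps the bytes exactly as PRECESS wrote them
    with open(in_path, 'rb') as src:
        kept = strip_box_lines(src.readlines())
    with open(out_path, 'wb') as dst:
        dst.writelines(kept)
```

The prompt lines ("Enter Date...", "Enter the Catalog Name...", "The Julian Date...") are plain ASCII, so they could be dropped with the same kind of test if they also count as part of the header.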

The original PRECESS code cannot be modified. So, short of rewriting the PRECESS program, I thought it would be easy to modify the output as needed. I'm pretty sure PRECESS is written in C.

Sorry for the long post; I tried to include only the relevant information. Please fire away with questions and comments.



TIA,
Jerry

