Kent Johnson wrote:
On Mon, Sep 8, 2008 at 2:46 AM, J. Van Brimmer <[EMAIL PROTECTED]> wrote:
I have a legacy program at work that outputs a text file with this header:
ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
º Radio Source Precession Program º
º by John B. Doe º
º 31 August 1992 º
ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍŒ
Enter Date for Precession as (MM-DD-YYYY) or C/R for 05-28-2004 > 05-28-2004
Enter the Catalog Name or C/R for CATALOG.SRC >
The Julian Date is = 2453153.5
0022+002 5.6564 +0.2713 00:22:37.54 00:16:16.65
0106+013 17.2117 +1.6052 01:08:50.80 01:36:18.58
.
I am trying to write a Python script to strip this header (the first five
lines) from the file.
As you can see, I can print out the three lines after the strange header
lines, but not the strange character lines. How can I match on those strange
characters? What are they?
The strange characters seem to be box drawing characters from DOS
codepage 437. See
http://www.microsoft.com/globaldev/reference/oem/437.htm
My guess is that the characters in your program are not actually the
same as the characters in the file because they use different
encodings. Try using the hex values for the characters:
if re.search('\xc9\xcd\xcd\xcd', line):
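A minimal sketch of the same idea, assuming Python 3, where the data and the pattern must both be bytes. The sample line and the names top_border, BORDER and is_border are made up for illustration; only the byte values come from codepage 437.

```python
import re

# Hypothetical sample: a short top border as raw codepage 437 bytes.
# 0xC9 is the top-left corner, 0xCD the horizontal bar, 0xBB the top-right corner.
top_border = b"\xc9" + b"\xcd" * 4 + b"\xbb"

# Decoding as cp437 shows which characters the bytes stand for.
decoded = top_border.decode("cp437")

# Match on the raw bytes: a top-left or bottom-left corner followed by bars.
BORDER = re.compile(rb"^[\xc9\xc8]\xcd+")

def is_border(line: bytes) -> bool:
    """Return True if the line starts like a box-drawn border line."""
    return BORDER.search(line) is not None
```

Matching on bytes avoids guessing which encoding the editor used when the pattern was typed, which is the mismatch suspected above.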
Kent
Thanks Kent, that worked. This is what the output looks like:
$ python srclist.py
???????????????????????????????????????????????????????????????????????????????
Hi there!
Hi there!
Not exactly what I expected, but at least it's recognizing the line, so now
I can delete it.
Thanks a million!
Jerry
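For the deletion step, here is one possible sketch, again assuming Python 3 and that every header line begins with a codepage 437 box character. The function name strip_box_header and the sample bytes are hypothetical, not taken from the original script. Decoding the remaining text as cp437 yields ordinary Unicode, which most modern terminals can print without falling back to question marks.

```python
# Bytes that begin a box-drawn header line in codepage 437:
# 0xC9 (top-left corner), 0xBA (vertical bar), 0xC8 (bottom-left corner).
BOX_START = (b"\xc9", b"\xba", b"\xc8")

def strip_box_header(raw: bytes) -> str:
    """Drop lines that start with a box character; return the rest as text."""
    kept = [
        line for line in raw.splitlines()
        if not line.lstrip().startswith(BOX_START)
    ]
    return b"\n".join(kept).decode("cp437")

# Hypothetical miniature of the file: three box lines, then one data line.
sample = (
    b"\xc9\xcd\xcd\xbb\n"
    b"\xba by John B. Doe \xba\n"
    b"\xc8\xcd\xcd\xbc\n"
    b"The Julian Date is = 2453153.5\n"
)
result = strip_box_header(sample)
```

To run this on a real file, read it in binary mode first, for example with open(path, "rb"), so that no decoding happens before the header lines are filtered out.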