Re: Newby: how to transform text into lines of text

Tim Chase Sun, 25 Jan 2009 15:35:12 -0800

One other caveat here, "line" contains the newline at the end, so
you might have


  print line.rstrip('\r\n')

to remove them.


I don't understand the presence of the '\r' there. Any '\x0d' that
remains after reading the file in text mode and is removed by that
rstrip would be a strange occurrence in the data which the OP may
prefer to find out about and deal with; it is not part of "the
newline". Why suppress one particular data character in preference to
others?

In an ideal world where everybody knew how to make a proper text-file, it wouldn't be an issue. Recreating the form of some of the data I get from customers/providers:


 >>> f = file('tmp/x.txt', 'wb')
 >>> f.write('headers\n')  # headers in Unix format
 >>> f.write('data1\r\n')  # data in Dos format
 >>> f.write('data2\r\n')
 >>> f.write('data3')   # no trailing newline of any sort
 >>> f.close()

Then reading it back in:

 >>> for line in file('tmp/x.txt'): print repr(line)
 ...
 'headers\n'
 'data1\r\n'
 'data2\r\n'
 'data3'
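A note for anyone repeating this on Python 3, as a rough sketch rather than anything from the session above: there, text mode translates '\r\n' to '\n' by default, so the '\r' only shows up if you ask for untranslated line endings with newline=''. The temporary path and the variable names below are made up for illustration.

```python
import io
import os
import tempfile

# Hypothetical scratch file holding the same mixed line endings as above.
path = os.path.join(tempfile.mkdtemp(), 'x.txt')
with io.open(path, 'wb') as f:
    f.write(b'headers\ndata1\r\ndata2\r\ndata3')

# newline='' splits lines but leaves the line endings untranslated.
with io.open(path, 'r', newline='') as f:
    untranslated = list(f)

# The default (newline=None) turns '\r\n' into '\n' while reading.
with io.open(path, 'r') as f:
    translated = list(f)

print(untranslated)
print(translated)
```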

As for wanting to know about stray '\r' characters, I only want the data -- I don't particularly like to be reminded of the incompetence of those who send me malformed text-files ;-)

The same applies in any case to the use of rstrip('\n'); if that finds
more than one occurrence of '\x0a' to remove, it has exceeded the
mandate of removing the newline (if any).

I believe that using the formulaic "for line in file(FILENAME)" iteration guarantees that each "line" will have at most one '\n' and it will be at the end (again, a malformed text-file with no terminal '\n' may cause it to be absent from the last line).
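A quick way to check that belief, sketched with an in-memory io.BytesIO standing in for the file (my assumption here is that it splits lines on '\n' the same way a real file opened in binary mode does):

```python
import io

# Same bytes as the example file above, held in memory instead of on disk.
raw = b'headers\ndata1\r\ndata2\r\ndata3'

lines = list(io.BytesIO(raw))
for line in lines:
    # At most one '\n' per line, and never anywhere but the last position.
    assert line.count(b'\n') <= 1
    assert b'\n' not in line[:-1]

# Only the final line is allowed to lack its terminating '\n'.
missing = [i for i, line in enumerate(lines) if not line.endswith(b'\n')]
print(missing)
```

With the sample data, the only index reported is that of the last line, 'data3'.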

So, we are left with the unfortunately awkward
    if line.endswith('\n'):
        line = line[:-1]

You're welcome to it, but I'll stick with my more DWIM solution of "get rid of anything that resembles an attempt at a CR/LF".
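To make the difference concrete, here is a small side-by-side sketch on the lines from the example file. The two function names are made up for illustration; the bodies are just the two idioms discussed in this thread.

```python
def strip_one_newline(line):
    # Strict version: remove exactly one trailing '\n' and nothing else.
    if line.endswith('\n'):
        line = line[:-1]
    return line


def strip_line_ending(line):
    # DWIM version: remove any trailing run of '\r' and '\n' characters.
    return line.rstrip('\r\n')


samples = ['headers\n', 'data1\r\n', 'data2\r\n', 'data3']
strict = [strip_one_newline(s) for s in samples]
dwim = [strip_line_ending(s) for s in samples]

print(strict)
print(dwim)
```

The strict version leaves the '\r' on the DOS-format lines ('data1\r', 'data2\r'), which is exactly what I don't want downstream. The trade-off is that rstrip will also eat a genuinely odd ending such as 'data\r\r\n' without complaint.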

Thank goodness I haven't found any of my data-sources using "\n\r" instead, which would require me to left-strip '\r' characters as well. Sigh. My kingdom for competency. :-/


-tkc





--
http://mail.python.org/mailman/listinfo/python-list
