Jay Mutter III wrote:
> "Jay Mutter III" <jmutter at uakron.edu> wrote
>
>> See example next:
>> A.-C. Manufacturing Company. (See Sebastian, A. A.,
>> and Capes, assignors.)
>> ...
>> Aaron, Solomon E., Boston, Mass. Pliers. No. 1,329,155 ;
>> Jan. 27 ; v. 270 ; p. 554.
>>
>> For instance, I would like to go to end of line and if last
>> character is a comma or semicolon or hyphen then
>> remove the CR.
>
> It would look something like:
>
> output = open('example.fixed','w')
> for line in file('example.txt'):
>     if line[-1] in ',;-': # check last character

The last character will always be a newline; try
    if len(line) > 1 and line[-2] in ';,-':
instead.

>         line = line.strip() # lose the C/R

This will also lose any leading or trailing whitespace. line.rstrip() would be safer.

>         output.write(line) # write to output
>     else: output.write(line) # append the next line complete with C/R
> output.close()
>
> Working from the above suggestion (and thank you very much - I did
> enjoy your online tutorial)
> I came up with the following:
>
> import os
> import sys
> import re
> import string

You don't need any of the above.

> # The next 5 lines are so I have an idea of how many lines i started
> with in the file.
>
> in_filename = raw_input('What is the COMPLETE name of the file you want
> to open: ')
> in_file = open(in_filename, 'r')
> text = in_file.read()

As Luke pointed out, you should use readlines() here.

> num_lines = text.count('\n')
> print 'There are', num_lines, 'lines in the file', in_filename
>
> output = open("cleandata.txt","a") # file for writing data to after
> stripping newline character
>
> # read file, copying each line to new file
> for line in text:
>     if line[:-1] in '-':
>         line = line.rstrip()
>         output.write(line)
>     else: output.write(line)

text is a single string (the result of read()), so the for loop walks over it one character at a time. Since line is a single character, line[:-1] is always an empty string and the condition will always be true. What this loop does is strip all the whitespace out of your file.

> print "Data written to cleandata.txt."
>
> # close the files
> in_file.close()
> output.close()
>
> The above ran with no errors, gave me the number of lines in my original
> file but then when I opened the cleandata.txt file
> I got:
>
> A.-C.䴀愀渀甀昀愀挀琀甀爀椀渀最�Company.⠀匀攀攀�Sebastian,䄀⸀�A.,�and
> 䌀愀瀀攀猀Ⰰ�assignors.)�A.䜀⸀�A.刀愀椀氀眀愀礀�Light☀�Signal䌀漀⸀�(See
> 䴀攀搀攀渀Ⰰ�Elof�Hassignor.)�A-N䌀漀洀瀀愀渀礀Ⰰ�The.⠀匀攀攀�Alexander愀
> 渀搀�Nasb,愀猀ⴀ�猀椀最渀漀爀猀⸀㬀�䄀一�Company,吀栀攀⸀�(See一愀猀栀Ⰰ�It.
> 䨀⸀Ⰰ�and䄀氀攀砀愀渀搀攀爀Ⰰ�as-�

This is mysterious. What is the original data? What OS are you running on? How did you view the file?

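Putting the corrections above together, here is a minimal sketch of the line-joining step. It is written in Python 3 syntax rather than the Python 2 of the quoted code, and the function name join_continued_lines and the sample list are made up for illustration. It strips only the line ending, so other whitespace is left alone, and it checks the last real character rather than the newline:

```python
def join_continued_lines(lines, markers=",;-"):
    """Join each line that ends in one of `markers` to the line after it."""
    pieces = []
    for line in lines:
        # Remove only the line ending; leading/trailing spaces are kept.
        body = line.rstrip("\r\n")
        if body and body[-1] in markers:
            pieces.append(body)   # continuation line: write it without the newline
        else:
            pieces.append(line)   # ordinary line: keep it exactly as read
    return "".join(pieces)


# Sample data taken from the example quoted at the top of the thread.
sample = [
    "A.-C. Manufacturing Company. (See Sebastian, A. A.,\n",
    "and Capes, assignors.)\n",
    "Aaron, Solomon E., Boston, Mass. Pliers. No. 1,329,155 ;\n",
    "Jan. 27 ; v. 270 ; p. 554.\n",
]
print(join_continued_lines(sample), end="")
```

For a real file you would pass the open file object instead of the list, since iterating over a file yields whole lines; writing the returned string to a file opened with mode 'w' (as in the first example, rather than 'a') avoids appending to the output of earlier runs.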
Kent

> So what did I do to cause all of the strange characters????
> Plus since this goes on it is as if it removed all \n and not just the
> ones after a hyphen which I was using as my test case.
>
> Thanks again.
>
> Jay
>
>> Then move line by line through the file and delete everything
>> after a numerical sequence
>
> Slightly more tricky because you need to use a regular expression.
> But if you know regex then only slightly.
>
>> I am wondering if Python would be a good tool
>
> Absolutely, it's one of the areas where Python excels.
>
>> find information on how to accomplish this
>
> You could check my tutorial on the three topics:
>
> Handling text
> Handling files
> Regular Expressions.
>
> Also the standard python documentation for the general tutorial
> (assuming you've done basic programming in some other language
> before) plus the re module
>
>> using something like the unix tool awk or something else??
>
> awk or sed could both be used, but Python is more generally
> useful so unless you already know awk I'd take the time to
> learn the basics of Python (a few hours maybe) and use that.
>
> --
> Alan Gauld
> Author of the Learn to Program web site
> http://www.freenetpages.co.uk/hp/alan.gauld
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor