ardief wrote:
[...]
> And I want the HTML char codes to turn into their equivalent plain
> text. I've looked at the newsgroup archives, the cookbook, the web in
> general and can't manage to sort it out. I thought doing something like
> this -
>
> file = open('filename', 'r')

It's not a good idea to use 'file' as a variable name, since you are
shadowing the builtin type of the same name.

> ofile = open('otherfile', 'w')
>
> done = 0
>
> while not done:
>     line = file.readline()
>     if 'THE END' in line:
>         done = 1
>     elif '—' in line:
>         line.replace('—', '--')

The replace method doesn't modify the 'line' string; it returns a new
string, so the result has to be assigned or written out directly.

>         ofile.write(line)
>     else:
>         ofile.write(line)

This should work (untested):

    infile = open('filename', 'r')
    outfile = open('otherfile', 'w')
    for line in infile:
        outfile.write(line.replace('—', '--'))

But I think the best approach is to use an existing application or
library that solves the problem. recode(1) can easily convert to and
from HTML entities:

    recode html..utf-8 filename

Best regards.
-- 
Roberto Bonvallet
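
For the library route, here is a minimal sketch in Python. It assumes a Python version whose standard library provides html.unescape (3.4 or later) and that the files are UTF-8; the names convert_entities and convert_file are made up for illustration and do not come from the original post. It decodes every entity first, then maps the em dash to '--' as in the loop above.

```python
import html


def convert_entities(line):
    """Return a copy of line with HTML entities decoded (hypothetical helper)."""
    # html.unescape returns a new string; like str.replace, it never
    # modifies its argument, so the result must be captured.
    text = html.unescape(line)
    # Assumption: em dashes (U+2014) should become '--' in the plain text.
    return text.replace('\u2014', '--')


def convert_file(in_path, out_path):
    """Copy in_path to out_path, decoding entities line by line."""
    # Assumption: both files are UTF-8 encoded.
    with open(in_path, 'r', encoding='utf-8') as infile, \
            open(out_path, 'w', encoding='utf-8') as outfile:
        for line in infile:
            outfile.write(convert_entities(line))
```

Calling convert_file('filename', 'otherfile') would mirror the loop in the reply, with the with statement closing both files when the loop finishes.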