Re: Regular expression worries
CSUIDL PROGRAMMEr wrote:
> folks
> I am new to python, so excuse me if i am asking stupid questions.

From what I see, you seem to be new to programming in general !-)

> I have a txt file and here are some lines of it
>
> Document Keyword
> Keyword Keyword
> Keyword
> Keyword Keyword Keyword
> Keyword Keyword Keyword
> I am writing a python program to replace the tags and word Document
> with Doc.
>
> Here is my python program
>
> #! /usr/local/bin/python
>
> import sys
> import string
> import re
>
> def replace():
>     filename='/root/Desktop/project/chatlog_20060819_110043.xml.txt'
>     try:
>         fh=open(filename,'r')
>     except:
>         print 'file not opened'
>         sys.exit(1)

You open your file a first time, and bind the reference to the file object to fh.

>     for l in
> open('/root/Desktop/project/chatlog_20060819_110043.xml.txt'):

And then you open the file a second time...

>         l=l.replace("Document", "DOC")

This creates a modified copy of the string referenced by l (talk about a bad name) and rebinds the name l to that copy.

>         fh.close()

Then you close fh... and discard the modifications to l.

> if __name__=="__main__":
>     replace()
>
> But it does not replace Document with Doc in the txt file

Why should it? You didn't ask for it !-)

> Is there anything wrong i am doing

Yes. The canonical way to modify a text file is to read from original / do transformations / *write modifications to a tmp file* / replace the original with the tmp file.

--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for p in '[EMAIL PROTECTED]'.split('@')])"
--
http://mail.python.org/mailman/listinfo/python-list
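The read / transform / write-to-tmp / replace recipe described in this reply could be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes Python 3 (the thread itself uses Python 2 syntax), and the function name replace_in_file is made up for the example.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite the file at `path` with each `old` replaced by `new`.

    Hypothetical helper, assuming Python 3 and a text file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                # str.replace returns a new string; write it out,
                # otherwise the change is simply thrown away.
                tmp.write(line.replace(old, new))
        # Swap the rewritten copy in place of the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the swap: drop the temp file.
        os.remove(tmp_path)
        raise
```

Calling replace_in_file(filename, "Document", "DOC") would then update the file on disk. Note that the rewritten file gets the temp file's permissions, which may differ from the original's.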
Re: Regular expression worries
>     for l in
> open('/root/Desktop/project/chatlog_20060819_110043.xml.txt'):
>
>         l=l.replace("Document", "DOC")
>         fh.close()
>
> But it does not replace Document with Doc in the txt file

In addition to closing the file handle for the loop *within* the loop, you're changing "l" (side note: a bad choice of names, as in most fonts, it's difficult to visually discern from the number "1"), but you're not writing it back out any place.

One would do something like

    outfile = open('out.txt', 'w')
    infile = open(filename)
    for line in infile:
        outfile.write(line.replace("Document", "DOC"))
    outfile.close()
    infile.close()

You could even let garbage collection take care of the file handle for you:

    outfile = open('out.txt', 'w')
    for line in open(filename):
        outfile.write(line.replace("Document", "DOC"))
    outfile.close()

If needed, you can then move 'out.txt' over the top of the original file.

Or, you could just use

    sed 's/Document/DOC/g' $FILENAME > out.txt

or, with a version of sed that accepts the -i option, do it in-place with

    sed -i 's/Document/DOC/g' $FILENAME

if you have sed available on your system.

Oh...and it doesn't look like your code is using regexps for anything, despite the subject-line of your email :) I suspect they'll come in later for the "replace the tags" portion you mentioned, but that ain't in the code.

-tkc
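For an in-place edit from Python itself, loosely comparable to the sed -i one-liner above, the standard library's fileinput module has an inplace mode. A minimal sketch, assuming Python 3; the function name replace_in_place is invented for this example and does not come from the thread:

```python
import fileinput


def replace_in_place(path, old, new):
    # With inplace=True, fileinput moves the original to a backup file
    # and redirects sys.stdout into a new file at `path`, so whatever
    # print() emits inside the loop becomes the new file content.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # Each line keeps its own newline, so suppress print's.
            print(line.replace(old, new), end="")
```

Because standard output is redirected while the loop runs, any debugging print() calls inside it would end up in the file as well.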
Re: Regular expression worries
You are opening the same file twice, reading its contents line-by-line into memory, replacing "Document" with "Doc" *in memory*, never writing that to disk, and then discarding the line you just read into memory.

If your file is short, you could read the entire thing into memory as one string using the .read() method of fh (your file object). Then, call .replace on the string, and then write to disk.

If your file is long, then you want to do the replace line by line, writing as you go to a second file. You can later rename that file to the original file's name and delete the original.

Also, you aren't using regular expressions at all. You do not therefore need the re module.

CSUIDL PROGRAMMEr wrote:
> folks
> I am new to python, so excuse me if i am asking stupid questions.
>
> I have a txt file and here are some lines of it
>
> Document Keyword
> Keyword Keyword
> Keyword
> Keyword Keyword Keyword
> Keyword Keyword Keyword
> I am writing a python program to replace the tags and word Document
> with Doc.
>
> Here is my python program
>
> #! /usr/local/bin/python
>
> import sys
> import string
> import re
>
> def replace():
>     filename='/root/Desktop/project/chatlog_20060819_110043.xml.txt'
>     try:
>         fh=open(filename,'r')
>     except:
>         print 'file not opened'
>         sys.exit(1)
>     for l in
> open('/root/Desktop/project/chatlog_20060819_110043.xml.txt'):
>
>         l=l.replace("Document", "DOC")
>         fh.close()
>
> if __name__=="__main__":
>     replace()
>
> But it does not replace Document with Doc in the txt file
>
> Is there anything wrong i am doing
>
> thanks
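The short-file approach described in this reply (read everything with .read(), call .replace on the string, write the result to disk) might look something like this. It is a sketch only, assuming Python 3 and a file small enough to hold in memory; replace_whole_file is a made-up name. No import of re is needed, since str.replace does a plain substring replacement.

```python
def replace_whole_file(path, old, new):
    # Read the entire file into one string (fine for short files).
    with open(path) as fh:
        text = fh.read()
    # str.replace returns a new string; the file on disk only changes
    # once that string is written back.
    with open(path, "w") as fh:
        fh.write(text.replace(old, new))
```

Unlike the write-to-a-second-file approach, this overwrites the original directly, so a crash in the middle of the write could leave the file incomplete.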
Regular expression worries
folks
I am new to python, so excuse me if i am asking stupid questions.

I have a txt file and here are some lines of it

Document Keyword
Keyword Keyword
Keyword
Keyword Keyword Keyword
Keyword Keyword Keyword