Re: Regular expression worries

2006-10-11 Thread Bruno Desthuilliers
CSUIDL PROGRAMMEr wrote:
> folks
> I am new to python, so excuse me if i am asking stupid questions.

From what I see, you seem to be new to programming in general !-)

> I have a txt file  and here are some lines of it
> 
> Document Keyword
> Keyword Keyword
> Keyword Keyword Keyword Keyword
> Keyword Keyword Keyword 
> I am writing a python program to replace the tags and word  Document
> with Doc.
> 
> Here is my python program
> 
> #! /usr/local/bin/python
> 
> import sys
> import string
> import re
> 
> def replace():
>   filename='/root/Desktop/project/chatlog_20060819_110043.xml.txt'
>   try:
>     fh=open(filename,'r')
>   except:
>     print 'file not opened'
>     sys.exit(1)

You open your file a first time, and bind the reference to the file
object to fh.

>   for  l in open('/root/Desktop/project/chatlog_20060819_110043.xml.txt'):

And then you open the file a second time...

>     l=l.replace("Document", "DOC")

This builds a new string with the replacement applied (strings are
immutable) and rebinds it to the same name, l (talk about a bad name).

>     fh.close()

Then you close fh... and discard the modifications to l.

> if __name__=="__main__":
>   replace()
> 
> But it does not replace Document with Doc in  the txt file

Why should it ? You didn't ask for it !-)

> Is there anything wrong i am doing

Yes.

The canonical way to modify a text file is to read from original / do
transformations / *write modifications to a tmp file* / replace the
original with the tmp file.
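
A minimal sketch of that read / transform / write-to-tmp / replace
sequence follows. It assumes a current Python 3 (os.replace does not
exist in older versions), and the function name replace_in_file is only
an illustration, not something from the original program.

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite the file at `path` with `old` replaced by `new` (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the tmp file next to the original so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                # Write the transformed line instead of discarding it.
                tmp.write(line.replace(old, new))
        # Swap the finished tmp file in for the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed: leave the original alone and clean up.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

It would be called as replace_in_file(filename, "Document", "DOC"). The
original is only touched once the tmp file has been written completely.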


-- 
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in '[EMAIL PROTECTED]'.split('@')])"
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regular expression worries

2006-10-11 Thread Tim Chase
>   for  l in
> open('/root/Desktop/project/chatlog_20060819_110043.xml.txt'):
> 
>   l=l.replace("Document", "DOC")
>   fh.close()
> 
> But it does not replace Document with Doc in  the txt file

In addition to closing the file handle for the loop *within* the 
loop, you're changing "l" (side note: a bad choice of names, as 
in most fonts, it's difficult to visually discern from the number 
"1"), but you're not writing it back out any place.  One would do 
something like

outfile = open('out.txt', 'w')
infile = open(filename)
for line in infile:
    outfile.write(line.replace("Document", "DOC"))
outfile.close()
infile.close()

You could even let garbage collection take care of the file 
handle for you:


outfile = open('out.txt', 'w')
for line in open(filename):
    outfile.write(line.replace("Document", "DOC"))
outfile.close()


If needed, you can then move 'out.txt' over the original file.

Or, you could just use

sed 's/Document/DOC/g' "$FILENAME" > out.txt

or, with a version of sed that accepts the -i option (GNU sed, for
example), do it in-place with

sed -i 's/Document/DOC/g' "$FILENAME"

if you have sed available on your system.

Oh...and it doesn't look like your code is using regexps for 
anything, despite the subject-line of your email :)  I suspect 
they'll come in later for the "replace the tags" portion you 
mentioned, but that ain't in the code.
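
For what it's worth, here is a small sketch of one case where the re
module would earn its keep: replacing "Document" only when it stands as
a whole word. The sample string and the name replace_word are made up
for illustration; the tags from the original post are not shown here
because they did not survive in the message.

```python
import re

# \b matches at a word boundary, so "Documentation" is left alone.
WORD = re.compile(r"\bDocument\b")

def replace_word(line):
    # Hypothetical helper: swap the standalone word for "DOC".
    return WORD.sub("DOC", line)

print(replace_word("Document Documentation"))         # DOC Documentation
print("Document Documentation".replace("Document", "DOC"))  # DOC DOCation
```

Plain str.replace has no notion of word boundaries, which is why the
second line also mangles "Documentation".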

-tkc







Re: Regular expression worries

2006-10-11 Thread johnzenger
You are opening the same file twice, reading its contents line-by-line
into memory, replacing "Document" with "Doc" *in memory*, never writing
that to disk, and then discarding the line you just read into memory.

If your file is short, you could read the entire thing into memory as
one string using the .read() method of fh (your file object).  Then,
call .replace on the string, and then write to disk.
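
A rough sketch of the short-file approach, assuming a current Python 3
and a file small enough to hold in memory. The name
replace_in_small_file is just for illustration.

```python
def replace_in_small_file(path, old, new):
    # Read the whole file into one string (fine for small files only).
    with open(path) as fh:
        text = fh.read()
    # str.replace returns a new string; keep the result.
    text = text.replace(old, new)
    # Reopening in "w" mode truncates the file before the new text is written.
    with open(path, "w") as fh:
        fh.write(text)
```

Note that this overwrites the file in place, so a crash between the
truncation and the write would lose data. That is the trade-off against
writing to a second file first.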

If your file is long, then you want to do the replace line by line,
writing as you go to a second file.  You can later rename that file to
the original file's name and delete the original.

Also, you aren't using regular expressions at all.  You do not
therefore need the re module.

CSUIDL PROGRAMMEr wrote:
> folks
> I am new to python, so excuse me if i am asking stupid questions.
>
> I have a txt file  and here are some lines of it
>
> Document Keyword
> Keyword Keyword
> Keyword Keyword Keyword Keyword
> Keyword Keyword Keyword
> I am writing a python program to replace the tags and word  Document
> with Doc.
>
> Here is my python program
>
> #! /usr/local/bin/python
>
> import sys
> import string
> import re
>
> def replace():
>   filename='/root/Desktop/project/chatlog_20060819_110043.xml.txt'
>   try:
>     fh=open(filename,'r')
>   except:
>     print 'file not opened'
>     sys.exit(1)
>   for  l in open('/root/Desktop/project/chatlog_20060819_110043.xml.txt'):
>
>     l=l.replace("Document", "DOC")
>     fh.close()
>
> if __name__=="__main__":
>   replace()
>
> But it does not replace Document with Doc in  the txt file
> 
> Is there anything wrong i am doing
> 
> thanks



Regular expression worries

2006-10-11 Thread CSUIDL PROGRAMMEr
folks
I am new to python, so excuse me if i am asking stupid questions.

I have a txt file  and here are some lines of it

Document Keyword
Keyword Keyword
Keyword Keyword Keyword Keyword
Keyword Keyword Keyword