Re: unicode text file

Mark Tolonen Sun, 27 Sep 2009 07:41:15 -0700

"Junaid" <junu...@gmail.com> wrote in message news:0267bef9-9548-4c43-bcdf-b624350c8...@p23g2000vbl.googlegroups.com...

I want to do replacements in a utf-8 text file. example


f=open("test.txt","r") #this file is utf-8 encoded
raw = f.read()
txt = raw.decode("utf-8")


You can use the codecs module to open and decode the file in one step.


txt.replace('English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string, but not working

The replace method returns the altered string. It does not modify the string in place, so you have to assign the result. You should also use Unicode strings for both arguments (although it doesn't matter in this case). Using a raw Unicode string is also unnecessary here.


   txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
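
As a small illustration of the point above (a sketch, not part of the original post; the sample string is made up), the first call discards its result and leaves txt untouched, while the second rebinds txt to the returned string:

```python
# coding: utf-8
txt = u'Learn English'

# The result is thrown away; strings are immutable, so txt is unchanged.
txt.replace(u'English', u'ഇംഗ്ലീഷ്')
unchanged = txt

# Rebinding txt to the returned string keeps the replacement.
txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
```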

f.write(txt)

You opened the file for reading. You'll need to close the file and reopen it for writing.

f.close()
f.flush()


Flush isn't required.  close() will flush.
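
A minimal sketch of why the order matters, assuming a hypothetical helper name write_then_close: close() writes out any buffered data, and calling flush() on an already closed file raises ValueError.

```python
def write_then_close(path, data):
    """Write bytes to path, close the file, then try to flush it.

    Returns True if flush() on the closed file raised ValueError.
    """
    f = open(path, 'wb')
    f.write(data)
    f.close()          # close() flushes buffered data to disk
    try:
        f.flush()      # the file is already closed at this point
    except ValueError:
        return True
    return False
```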

Also, to have text like ഇംഗ്ലീഷ് in a source file, you'll need to declare the encoding of the file at the top and be sure to actually save the file in that encoding.


In summary:

   # coding: utf-8
   import codecs
   f = codecs.open('test.txt','r','utf-8')
   txt = f.read()
   txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
   f.close()
   f = codecs.open('test.txt','w','utf-8')
   f.write(txt)
   f.close()
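
The same read, replace, write sequence can also be sketched with "with" blocks, which close the file even if an error occurs. This is only a sketch under assumptions not in the post above: it uses io.open (available in Python 2.6+ and Python 3) instead of codecs.open, and replace_in_file is a hypothetical helper name.

```python
# coding: utf-8
import io

def replace_in_file(path, old, new, encoding='utf-8'):
    """Replace old with new in a text file and return the new text."""
    # Read and decode in one step; the file is closed when the block ends.
    with io.open(path, 'r', encoding=encoding) as f:
        txt = f.read()
    # replace() returns a new string, so keep the result.
    txt = txt.replace(old, new)
    # Reopen for writing; the text is encoded on the way out.
    with io.open(path, 'w', encoding=encoding) as f:
        f.write(txt)
    return txt
```

It would be called as replace_in_file('test.txt', u'English', u'ഇംഗ്ലീഷ്').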

-Mark


--
http://mail.python.org/mailman/listinfo/python-list
