UTF-8 Encoding Error

subhabangalore Thu, 22 Dec 2016 22:43:42 -0800

I am getting the following error when I try to read some files through
TaggedCorpusReader, a corpus reader class in NLTK:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15: invalid start byte

My files are saved in ANSI format, the MS-Windows default.
I am using Python 2.7 on MS-Windows 7.
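
For illustration, here is a minimal sketch of why a byte such as 0x96 can
trigger this error. It assumes the "ANSI" code page is cp1252, which is common
on Western-European Windows installs but is not confirmed for these files. The
sample bytes are hypothetical.

```python
# Hypothetical sample: in cp1252, byte 0x96 is an en dash (U+2013).
raw = b"pages 10\x9620"

# 0x96 has the bit pattern 10xxxxxx, which UTF-8 only allows as a
# continuation byte, so it cannot start a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the code page the file was actually saved in succeeds.
text = raw.decode("cp1252")
```

If the files really are cp1252, the same bytes that fail under 'utf8' decode
cleanly once the correct encoding is named.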

So far I have tried the following options:
string.encode('utf-8').strip()
unicode(string)
unicode(str, errors='replace')
unicode(str, errors='ignore')
string.decode('cp1252')

But none of them helped.

Any suggestions would be appreciated.

I am still trying on my side; please take a look if you can.
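
One possible direction, sketched under the assumption that the files are
cp1252: re-save each file as UTF-8 before handing it to the corpus reader.
The helper name convert_to_utf8 and its parameters are made up for this
example. Calling .encode('utf-8') on a Python 2 byte string does not help
here, because Python 2 first decodes the bytes as ASCII implicitly, which
fails on 0x96 as well.

```python
import io


def convert_to_utf8(src_path, dst_path, src_encoding="cp1252"):
    """Read src_path as src_encoding and write the same text as UTF-8.

    src_encoding defaults to cp1252, which is an assumption about the
    files; change it if they use a different code page.
    """
    # io.open behaves the same on Python 2.7 and Python 3.
    with io.open(src_path, "r", encoding=src_encoding) as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return text
```

Alternatively, as far as I know the TaggedCorpusReader constructor accepts an
encoding keyword argument (default 'utf8'), so passing encoding='cp1252'
there may avoid the conversion step altogether.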
-- 
https://mail.python.org/mailman/listinfo/python-list
