Re: Character encoding & the copyright symbol

Philip Semanchuk Thu, 06 Aug 2009 10:03:46 -0700


On Aug 6, 2009, at 12:41 PM, Robert Dailey wrote:

On Aug 6, 11:31 am, "Richard Brodie" <[email protected]> wrote:

"Robert Dailey" <[email protected]> wrote in message

news:29ab0981-b95d-4435-91bd-a7a520419...@b15g2000yqd.googlegroups.com...

UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
position 1650: character maps to <undefined>

The file is defined as ASCII.


That's the problem: ASCII is a seven bit code. What you have is
actually ISO-8859-1 (or possibly Windows-1252).

The different ISO-8859-n variants assign various characters
to '\xa9'. Rather than being Western-European centric and assuming
ISO-8859-1 by default, Python throws an error when you stray
outside of strict ASCII.
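
To illustrate the point about the variants, here is a small sketch showing that the single byte 0xA9 decodes to different characters depending on which codec is assumed, and that strict ASCII rejects it outright. The codecs listed are only examples chosen for the demonstration; nothing here is taken from the original file.

```python
raw = b"\xa9"  # the byte in question

# The same byte means different things under different 8-bit encodings.
codecs_to_try = ("latin_1", "cp1252", "iso8859_2", "iso8859_5")
decoded = {name: raw.decode(name) for name in codecs_to_try}
for name, char in decoded.items():
    # ascii() keeps the output printable on any console.
    print(name, ascii(char))

# ASCII only covers 0x00-0x7F, so decoding 0xA9 as ASCII fails.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print("ascii accepts 0xA9:", ascii_ok)
```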


Thanks for the help guys. Sorry I left out code, I wasn't sure at the
time if it would be helpful. Below is my code:


#========================================================
def GetFileContentsAsString( file ):
  f = open( file, mode='r', encoding='cp1252' )
  contents = f.read()
  f.close()
  return contents

#========================================================
def ReplaceVersion( file, version, regExps ):
  #match = regExps[0].search( 'FILEVERSION 1,45332,2100,32,' )
  #print( match.group() )
  text = GetFileContentsAsString( file )
  print( text )


As you can see, I am trying to load the file with encoding 'cp1252'
which, according to the Python 3.1 docs, translates to Windows-1252. I
also tried 'latin_1', which translates to ISO-8859-1, but this did not
work either. Am I doing something else wrong?

Are you getting the error when you read the file or when you print(text)?
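
The distinction matters because the traceback reports an *encode* error, and reading a file decodes while printing encodes. A rough sketch of how the two steps differ follows; the sample string and the choice of cp437 as the console code page are assumptions made for illustration, not details from the original post.

```python
# Hypothetical sample text; '\xa9' is the copyright sign.
text = "Copyright \xa9 2009"

# Decoding cp1252 bytes that contain 0xA9 succeeds, so reading is fine.
raw = b"Copyright \xa9 2009"
assert raw.decode("cp1252") == text

# Encoding to a code page that lacks the character fails, which is what
# print() runs into on a console using such a code page (cp437 assumed).
error_message = None
try:
    text.encode("cp437")
except UnicodeEncodeError as exc:
    error_message = str(exc)
    print(error_message)

# One possible workaround: substitute unencodable characters instead of raising.
safe = text.encode("cp437", errors="replace").decode("cp437")
print(safe)
```

If the error only shows up at the print() call, the file was read correctly and the console encoding is the thing to look at.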

As a side note, you should probably use something other than "file" for the parameter name in GetFileContentsAsString() since file() is a Python function.
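
For what it's worth, file() was a built-in in Python 2 and is no longer one in Python 3, but a more descriptive parameter name still reads better. Here is a minimal sketch of the helper with the parameter renamed and a with-block added; the names get_file_contents_as_string and path, and the encoding default, are my own choices rather than anything from the original code.

```python
def get_file_contents_as_string(path, encoding="cp1252"):
    """Return the decoded text of the file at `path`."""
    # The with-block closes the file even if read() raises.
    with open(path, mode="r", encoding=encoding) as f:
        return f.read()
```

Passing the encoding as an argument also makes it easy to try 'latin_1' or another codec without editing the function.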





--
http://mail.python.org/mailman/listinfo/python-list
