In this case, it looks like the test files aren't really utf-8-sig.  That is, 
under CPython:
C:\Users\dfugate\Desktop>C:\Python27\python.exe
Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> import nt
>>> dir = nt.getcwd()
>>> i = 'a - 2 lines.txt'
>>> file = codecs.open(dir + "\\" + i, "r", "utf_8_sig")
>>> for line in file: print line
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-7: character maps to <undefined>

>>> file.close()
>>> i = 'a - 3 lines.txt'
>>> file = codecs.open(dir + "\\" + i, "r", "utf_8_sig")
>>> for line in file: print line
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-7: character maps to <undefined>
>>> file.close()

The bug here is that IronPython can process 'a - 2 lines.txt'.
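
Both tracebacks above are raised inside cp437.py's encode, which suggests the failure happens when print writes to the console, not while the file is being decoded. The following is a minimal sketch that separates the two steps. The sample bytes are hypothetical stand-ins for the attached files (their real contents are not shown here), and the names raw, text, console_ok and safe are made up for illustration.

```python
import codecs

# Hypothetical sample: a UTF-8 BOM followed by non-ASCII text.
# This stands in for the attached files, whose contents are not shown.
raw = codecs.BOM_UTF8 + u"\u0645\u0631\u062d\u0628\u0627\n".encode("utf-8")

# Step 1: decoding with utf_8_sig strips the BOM and yields unicode text.
text = raw.decode("utf_8_sig")

# Step 2: encoding for a cp437 console is a separate step, and it is
# the one that fails for characters cp437 has no mapping for.
try:
    text.encode("cp437")
    console_ok = True
except UnicodeEncodeError:
    console_ok = False

# Escaping non-ASCII characters gives output that any console can show.
safe = text.encode("ascii", "backslashreplace")
```

If console_ok comes out False while the decode in step 1 succeeds, the file itself decoded fine and only the console output failed. Printing repr(line) instead of line in the session above would be another way to sidestep the console encoding.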

Dave

From: users-boun...@lists.ironpython.com [mailto:users-boun...@lists.ironpython.com] On Behalf Of abdalla ramadan
Sent: Friday, July 09, 2010 11:46 AM
To: users@lists.ironpython.com
Subject: [IronPython] reading utf-8 files

Hello,

I am trying to read UTF-8 files (written with Notepad, so they have a BOM) using the 
following code:

import codecs

# dir is the directory holding the files; i is the file name
file = codecs.open(dir + '\\' + i, "r", 'utf_8_sig')
for line in file:
    print "line"

I attached two files. The a - 3 lines.txt file gives this exception, and print 
"line" is never called, not even once:

Unhandled Exception: System.Text.EncoderFallbackException: failed to decode bytes at index 65

The file a - 2 lines.txt, however, is read without problems.

I tried several different texts, but I could not find a rule for which files 
throw this exception. I tested other files with 3 lines that did not throw the 
exception.
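
One way to narrow down which files fail is to look at the raw bytes directly. The sketch below checks for the UTF-8 BOM and reports the byte offset of the first sequence that is not valid UTF-8, which can then be compared with the index in the exception message. The function name diagnose_utf8 is hypothetical, and this only tests strict UTF-8 validity; it is not a statement about how IronPython decodes internally.

```python
import codecs

def diagnose_utf8(raw):
    """Return (has_bom, error_offset) for a byte string.

    error_offset is the index into raw of the first invalid UTF-8
    sequence, or None if everything after the BOM decodes cleanly.
    """
    has_bom = raw.startswith(codecs.BOM_UTF8)
    # Skip the BOM ourselves so the reported offset is relative to raw.
    body = raw[len(codecs.BOM_UTF8):] if has_bom else raw
    try:
        body.decode("utf-8")
    except UnicodeDecodeError as exc:
        return has_bom, exc.start + (len(raw) - len(body))
    return has_bom, None
```

To use it on a file, read the contents in binary mode first, for example raw = open(path, "rb").read(), and pass raw to the function.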

Thanks very much in advance
_______________________________________________
Users mailing list
Users@lists.ironpython.com
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
