Rickard Lindberg, 09.02.2011 09:32:
On Tue, Feb 8, 2011 at 5:41 PM, Chris Rebert <c...@rebertia.com> wrote:
Here is a bash script to reproduce my error:

Including the error message and traceback is still helpful, for future
reference.

Thanks for pointing it out.

    #!/bin/sh

    cat > å.timeline <<EOF
<snip>
    EOF

    python <<EOF
    # encoding: utf-8
    from xml.sax import parse
    from xml.sax.handler import ContentHandler
    parse(u"å.timeline", ContentHandler())
    EOF

If I instead do

    parse(u"å.timeline".encode("utf-8"), ContentHandler())

the script runs without errors.

Is this a bug or expected behavior?

Bug; open() figures out the filesystem encoding just fine.
Bug tracker to report the issue to: http://bugs.python.org/

Workaround:
parse(open(u"å.timeline", 'r'), ContentHandler())

When I tried your workaround, I still got this error:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/__init__.py", line 31, in parse
    parser.parse(filename_or_stream)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/xmlreader.py", line 119, in parse
    self.prepareParser(source)
  File "/usr/lib64/python2.7/site-packages/_xmlplus/sax/expatreader.py", line 121, in prepareParser
    self._parser.SetBase(source.getSystemId())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)

The open(...) part works fine, but there still seems to be a problem inside the
SAX parser.
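
One possible way around the remaining failure, as a sketch only: the traceback shows the error is raised in `SetBase(source.getSystemId())`, so the assumption here is that the parser picks up the non-ASCII name from the open file object and uses it as the system ID. Wrapping the stream in an explicit `InputSource` that has no system ID should avoid that call. `parse_stream` and `NameCollector` are hypothetical names used for illustration, and this has not been verified against the `_xmlplus` (PyXML) modules shown in the traceback.

```python
import io
from xml.sax import parse
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class NameCollector(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_stream(stream, handler):
    """Parse a binary stream without giving the parser a file name."""
    source = InputSource()        # system ID is left as None (assumed to skip SetBase)
    source.setByteStream(stream)  # the parser reads its bytes from this stream
    parse(source, handler)


# In-memory stand-in for the file; for the real file one would pass
# open(u"\xe5.timeline", "rb") instead.
handler = NameCollector()
parse_stream(io.BytesIO(b"<timeline><event/></timeline>"), handler)
```

The trade-off is that, without a system ID, the parser has no base for resolving relative external entities, which only matters if the document uses them.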

Did you read my reply?

Stefan

--
http://mail.python.org/mailman/listinfo/python-list
