Package: python-lxml
Version: 1.0.1-2
Severity: normal
Hello,
Here is a transcript of a session using python-lxml; the bug is quite
obvious:
In [1]:import lxml.etree
In [2]:doctree = lxml.etree.parse(file('/tmp/test.xml'))
In [3]:for tag in doctree.xpath('/tags/*'):
.3.:    print tag.attrib['tag'], tag.attrib['count']
.3.:
... lot of output
In [4]:toto = file('/tmp/test.xml').read()
In [5]:import StringIO
In [6]:lxml.etree.parse(StringIO.StringIO(toto))
---
exceptions.AssertionError                 Traceback (most recent call last)
/home/nicoe/projets/gnomolicious/src/ipython console
/home/nicoe/projets/gnomolicious/src/etree.pyx in etree.parse()
/home/nicoe/projets/gnomolicious/src/parser.pxi in etree._parseDocument()
/home/nicoe/projets/gnomolicious/src/parser.pxi in etree._parseMemoryDocument()
/home/nicoe/projets/gnomolicious/src/apihelpers.pxi in etree._utf8()
AssertionError: All strings must be Unicode or ASCII
/home/nicoe/projets/gnomolicious/src/apihelpers.pxi(332)etree._utf8()
ipdb> q
Obviously there should not be any problem when parsing a file through
the StringIO interface if there is no problem parsing the same file
through the file interface.
Since the same behavior happens with cStringIO, I suppose this bug is
related to python-lxml.
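
For anyone trying to reproduce this, here is a minimal sketch of the comparison I would expect to succeed. It is not the original session. It assumes a current Python with io.BytesIO in place of StringIO, and it assumes that test.xml contains non-ASCII characters, which the assertion message suggests but which I have not confirmed. SAMPLE, tags_from and compare_file_and_memory are made-up names, and the fallback to xml.etree.ElementTree is only there so that the sketch runs where lxml is not installed.

```python
import io
import os
import tempfile

try:
    from lxml import etree  # the library this report is about
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    import xml.etree.ElementTree as etree

# Hypothetical stand-in for /tmp/test.xml: UTF-8 bytes holding a
# non-ASCII attribute value (an assumption about the real file).
SAMPLE = u'<tags><item tag="caf\u00e9" count="3"/></tags>'.encode("utf-8")


def tags_from(source):
    """Parse a file-like object and return (tag, count) attribute pairs."""
    root = etree.parse(source).getroot()
    return [(child.attrib["tag"], child.attrib["count"]) for child in root]


def compare_file_and_memory(data=SAMPLE):
    """Parse the same bytes from a real file and from an in-memory buffer."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        # Write the sample to disk, then parse it through the file interface.
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        with open(path, "rb") as handle:
            from_file = tags_from(handle)
    finally:
        os.remove(path)
    # Parse the identical bytes through the in-memory interface.
    from_memory = tags_from(io.BytesIO(data))
    return from_file, from_memory


if __name__ == "__main__":
    from_file, from_memory = compare_file_and_memory()
    print(from_file == from_memory)
```

Both code paths receive exactly the same bytes, so the two results should be equal. If they differ, or if the in-memory parse raises, the problem is in how the in-memory source is handled.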
-- System Information:
Debian Release: testing/unstable
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.16-2-k7
Locale: LANG=fr_BE.UTF-8, LC_CTYPE=fr_BE.UTF-8 (charmap=UTF-8)
Versions of packages python-lxml depends on:
ii libc6 2.3.6-15 GNU C Library: Shared libraries
ii libxml2 2.6.26.dfsg-1 GNOME XML library
ii libxslt1.1 1.1.17-1 XSLT processing library - runtime
ii python 2.3.5-10 An interactive high-level object-o
ii python-central 0.5.0 register and build utility for Pyt
ii zlib1g 1:1.2.3-11 compression library - runtime
python-lxml recommends no packages.
-- no debconf information
test.xml
Description: application/xml