Bugs item #1627096, was opened at 2007-01-03 17:06
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1627096&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: Pierre Imbaud (pmi)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.dom.minidom parse bug

Initial Comment:
xml.dom.minidom was unable to parse an XML file that came from an example
provided by an official organization (http://www.iptc.org/IPTC4XMP).
The parsed file was somewhat hairy, but I have been able to reproduce the bug
with a simplified version, attached. (It ends with .xmp: it's supposed to be an
XMP file, the XMP standard being built on XML. Well, that's the short story.)

The offending part is the one that goes: xmpPLUS='....'
It triggers an exception: ValueError: too many values to unpack,
in _parse_ns_name.
Some debugging showed an obvious mistake in the scanning of the name
argument, which goes beyond the closing " ' ".

I dug a little further through a pdb session, but the bug seems to be located
in C code.
This is the very first time I report a bug, so chances are I provide too much
or too little information...

To whoever it may concern, here is the invoking code:

from xml.dom import minidom
...
class xmp(dict):
    def __init__(self, inStream):
        xmldoc = minidom.parse(inStream)
        ....
x = xmp('/home/pierre/devt/port/IPTCCore-Full/x.xmp')

traceback:

/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xmpLib.py in __init__(self, inStream)
     26     def __init__(self, inStream):
     27         print minidom
---> 28         xmldoc = minidom.parse(inStream)
     29         xmpmeta = xmldoc.childNodes[1]
     30         rdf = xmpmeta.childNodes[1]

/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/nxml/dom/minidom.py in parse(file, parser, bufsize)

/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parse(file, namespaces)
    922         fp = open(file, 'rb')
    923         try:
--> 924             result = builder.parseFile(fp)
    925         finally:
    926             fp.close()

/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parseFile(self, file)
    205                 if not buffer:
    206                     break
--> 207                 parser.Parse(buffer, 0)
    208                 if first_buffer and self.document.documentElement:
    209                     self._setup_subset(buffer)

/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in start_element_handler(self, name, attributes)
    743     def start_element_handler(self, name, attributes):
    744         if ' ' in name:
--> 745             uri, localname, prefix, qname = _parse_ns_name(self, name)
    746         else:
    747             uri = EMPTY_NAMESPACE

/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in _parse_ns_name(builder, name)
    125         localname = intern(localname, localname)
    126     else:
--> 127         uri, localname = parts
    128         prefix = EMPTY_PREFIX
    129         qname = localname = intern(localname, localname)

ValueError: too many values to unpack

The offending C statement:
/usr/src/packages/BUILD/Python-2.4/Modules/pyexpat.c(582)StartElement()

The returned 'name':
(Pdb) name
Out[5]: u'XMP Photographic Licensing Universal System (xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS'

It's obvious the scanning went beyond the attribute.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2007-01-04 12:18

Message:
Logged In: YES
user_id=21627
Originator: NO

This is not a bug in Python, but a bug in the XML document. According to
section 2.1 of

http://www.w3.org/TR/2006/REC-xml-names-20060816/

an XML namespace name must be a URI reference; according to RFC 3986, the
string

"XMP Photographic Licensing Universal System (xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/)"

is not a URI reference, as it contains spaces. In namespace mode the parser
hands the builder the namespace name, the local name and the prefix joined by
spaces, so a namespace name that itself contains spaces cannot be split back
into its parts; that is where the ValueError comes from.

Closing this report as invalid.

If you want to work around this bug, you can parse the file in
non-namespace mode, using

xml.dom.expatbuilder.parse("/tmp/x.xmp", namespaces=False)

----------------------------------------------------------------------
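
The workaround suggested in the comment above can be sketched as follows. This is a minimal illustration written for Python 3, not the code from the report: the inline SAMPLE document is a hypothetical stand-in for the attached x.xmp (which is not reproduced in this message), and the helper names fails_in_namespace_mode and parse_without_namespaces are made up for the example. The exact exception raised in namespace mode is an assumption that depends on the Python and Expat versions in use, so the sketch catches both likely types.

```python
from xml.dom import expatbuilder
from xml.parsers.expat import ExpatError

# Hypothetical stand-in for the attached x.xmp: the namespace name bound to
# the xmpPLUS prefix contains spaces, so it is not a valid URI reference.
SAMPLE = (
    "<root xmlns:xmpPLUS='XMP Photographic Licensing Universal System "
    "(xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/)'>"
    "<xmpPLUS:CreditLineReq>False</xmpPLUS:CreditLineReq>"
    "</root>"
)


def fails_in_namespace_mode(text):
    """Return True if namespace-aware parsing rejects the document."""
    try:
        expatbuilder.parseString(text, namespaces=True)
    except (ValueError, ExpatError):
        # ValueError comes from expatbuilder splitting the element name on
        # spaces; newer Expat releases may instead reject the document
        # themselves with an ExpatError (assumption, version dependent).
        return True
    return False


def parse_without_namespaces(text):
    """Parse with namespace processing switched off (the suggested workaround)."""
    return expatbuilder.parseString(text, namespaces=False)


if __name__ == "__main__":
    print(fails_in_namespace_mode(SAMPLE))
    doc = parse_without_namespaces(SAMPLE)
    element = doc.documentElement.firstChild
    # In this mode prefixed names are kept as plain strings.
    print(element.tagName)
```

Note that in non-namespace mode the resulting DOM carries no namespace information: names such as xmpPLUS:CreditLineReq are ordinary tag names, and the xmlns:xmpPLUS declaration is just another attribute, so any code that relies on namespaceURI or localName has to be adapted.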