Matt Funk, 21.02.2011 18:30:
I want to create a set of XML input files for my code that look as follows:
<?xml version="1.0" encoding="UTF-8"?>

<!-- Settings for the algorithm to be performed -->
<Algorithm>

     <!-- The algorithm type. -->
     <!-- The supported options are: -->
     <!--     - Alg0 -->
     <!--     - Alg1 -->
     <Type>Alg1</Type>

     <!-- the location/path of the input file for this algorithm -->
     <path>./Alg1.in</path>

</Algorithm>


<!-- Relevant information during the processing will be written to a logfile -->
<Logfile>

     <!-- the location/path of the logfile (i.e. where to put the logfile) -->
     <path>c:\tmp</path>

     <!-- verbosity level (i.e. how much to print) -->
     <!-- The supported options are: -->
     <!--     - 0    (nothing printed) -->
     <!--     - 1    (print on error) -->
     <verbosity>1</verbosity>

</Logfile>

That's not XML. XML documents have exactly one root element, i.e. you need an enclosing element around these two top-level elements.
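
If the files cannot be changed, one workaround is to wrap the content in a single root element before parsing. The following is only a sketch using the standard library's ElementTree; the root name `Config` and the helper `parse_with_root` are made up for illustration, and the sample input is a shortened version of the one above. Adding the enclosing element to the files themselves is the cleaner fix.

```python
import re
import xml.etree.ElementTree as ET

# Shortened version of the input above: two top-level elements, no single root.
RAW = """<?xml version="1.0" encoding="UTF-8"?>
<Algorithm>
    <Type>Alg1</Type>
    <path>./Alg1.in</path>
</Algorithm>
<Logfile>
    <verbosity>1</verbosity>
</Logfile>
"""


def parse_with_root(text, root_tag="Config"):
    """Wrap the content in one (hypothetical) root element and parse it."""
    # The XML declaration is only allowed at the very start of a document,
    # so remove it before wrapping.
    body = re.sub(r"^\s*<\?xml[^>]*\?>", "", text)
    return ET.fromstring("<%s>%s</%s>" % (root_tag, body, root_tag))


# Parsing the input unchanged fails because of the second top-level element.
try:
    ET.fromstring(RAW)
    raw_error = None
except ET.ParseError as exc:
    raw_error = exc

# With an enclosing element the same content parses fine.
root = parse_with_root(RAW)
alg_type = root.find("Algorithm/Type").text
```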


So there are comments, whitespace, etc. in it.
I would like to be able to put everything into some sort of structure

Including the comments or without them? Note that ElementTree will ignore comments.
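
A quick way to see this, assuming the standard library's `xml.etree.ElementTree` with its default parser (the document here is a cut-down, single-root version of the one above):

```python
import xml.etree.ElementTree as ET

doc = """<Algorithm>
    <!-- The algorithm type. -->
    <Type>Alg1</Type>
</Algorithm>"""

root = ET.fromstring(doc)

# Only element nodes end up in the tree; the comment is not among the children.
child_tags = [child.tag for child in root]
```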


such that I can access it as:
structure['Algorithm']['Type'] == 'Alg1'

Have a look at lxml.objectify. It allows you to write

    alg_type = root.Algorithm.Type.text

and a couple of other niceties.

http://lxml.de/objectify.html
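
A slightly fuller sketch of the attribute access, assuming lxml is installed and the input has been given an enclosing root element (the name `Config` is made up, and the sample is a cut-down version of the input above):

```python
from lxml import objectify

# Cut-down input with an enclosing root element; "Config" is a made-up name.
XML = b"""<Config>
  <Algorithm>
    <Type>Alg1</Type>
    <path>./Alg1.in</path>
  </Algorithm>
  <Logfile>
    <verbosity>1</verbosity>
  </Logfile>
</Config>"""

root = objectify.fromstring(XML)

# Child elements are reachable as attributes of their parent.
alg_type = root.Algorithm.Type.text

# objectify also guesses a Python type from the element text.
verbosity = root.Logfile.verbosity.pyval
```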


I was wondering if there is something out there that does this.
I found and tried a few things:
1) http://code.activestate.com/recipes/534109-xml-to-python-data-structure/
It simply doesn't work. I get the following error:
raise exception
xml.sax._exceptions.SAXParseException: <unknown>:1:2: not well-formed (invalid token)

"not well formed" == "not XML".


But I removed everything from the file except:
<?xml version="1.0" encoding="UTF-8"?>
and I still got the error.

That's not XML, either. A document needs a root element, and the XML declaration on its own isn't one.


Anyway, I looked at ElementTree, but that errors out with:
xml.parsers.expat.ExpatError: junk after document element: line 19, column 0

In any case, ElementTree is preferable over a SAX based solution, both for performance and maintainability reasons.
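
For the dictionary-style access asked for above (`structure['Algorithm']['Type']`), a few lines on top of ElementTree may already be enough. This is only a sketch: `to_dict` and the root name `Config` are made up, and it assumes there are no attributes and no repeated tag names under the same parent (a repeated tag would overwrite the earlier entry).

```python
import xml.etree.ElementTree as ET


def to_dict(elem):
    """Map child tags to nested dicts and leaf elements to their stripped text."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    return {child.tag: to_dict(child) for child in children}


# Cut-down input with an enclosing root element; "Config" is a made-up name.
XML = """<Config>
  <Algorithm><Type>Alg1</Type><path>./Alg1.in</path></Algorithm>
  <Logfile><verbosity>1</verbosity></Logfile>
</Config>"""

structure = to_dict(ET.fromstring(XML))
alg_type = structure["Algorithm"]["Type"]
```

All values come back as strings here, so something like the verbosity level would still need an explicit `int()` conversion.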

Stefan

--
http://mail.python.org/mailman/listinfo/python-list
