On Fri, Feb 20, 2009 at 4:09 PM, Thomas A. Schmitz <thomas.schm...@uni-bonn.de> wrote:
>
> On Feb 19, 2009, at 3:10 PM, luigi scarso wrote:
>
>> see
>> http://codespeak.net/lxml/tutorial.html#namespaces
>
> Luigi,
>
> thanks so much for your patient replies. I have now begun to play with
> Python's lxml. It offers a lot, maybe too much for a beginner. One advantage
> for my immediate needs that I see is that it offers the possibility to use
> Python's regular expressions and control structures, so this may make coding
> easier to maintain and adapt than in the rather clumsy XSLT syntax; it may
> be a big help for the rather messy OpenOffice XML that I want to process.
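
To make the point about regular expressions and control structures concrete, here is a minimal sketch. Everything specific in it is an assumption for illustration only: the sample XML, the style names T1 and T2, the STYLE_TO_MACRO mapping and the helper describe_spans are made up and do not come from a real OpenOffice file. It only uses calls shared by lxml and the standard library's ElementTree, so it falls back to the latter if lxml is not installed.

```python
import re

try:
    from lxml import etree
except ImportError:
    # Fallback so the sketch also runs without lxml; only calls from the
    # shared ElementTree API are used below.
    import xml.etree.ElementTree as etree

URI_OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
URI_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
NS = {"office": URI_OFFICE, "text": URI_TEXT}

# Made-up stand-in for a fragment of an OpenOffice content.xml.
SAMPLE = """<office:text
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <text:p>Plain <text:span text:style-name="T1">emphasised   text</text:span> here.</text:p>
  <text:p><text:span text:style-name="T2">  </text:span>Second paragraph.</text:p>
</office:text>"""

# Attributes are namespaced too, so they need the {uri}name form.
STYLE_ATTR = "{%s}style-name" % URI_TEXT

# Hypothetical mapping from automatic style names to output macros.
STYLE_TO_MACRO = {"T1": "\\emph"}


def describe_spans(root):
    """Return one cleaned-up string per non-empty text:span element."""
    results = []
    for span in root.findall(".//text:span", namespaces=NS):
        # Regular expression: collapse runs of whitespace to one space.
        text = re.sub(r"\s+", " ", span.text or "").strip()
        if not text:
            continue  # ordinary control flow: skip whitespace-only spans
        macro = STYLE_TO_MACRO.get(span.get(STYLE_ATTR))
        if macro:
            results.append("%s{%s}" % (macro, text))
        else:
            results.append(text)
    return results


root = etree.fromstring(SAMPLE)
print(describe_spans(root))
```

The prefix map passed as namespaces= is what lets the path say text:span instead of spelling out the full URI; the prefixes only have to match the keys of that dictionary, not the prefixes used in the file.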

Also:

Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> URI_OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
>>> URI_STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
>>> URI_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
>>> URI_TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
>>> URI_DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
>>> URI_FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
>>> URI_XLINK = "http://www.w3.org/1999/xlink"
>>> URI_DC = "http://purl.org/dc/elements/1.1/"
>>> URI_META = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
>>> URI_NUMBER = "urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
>>> URI_PRESENTATION = "urn:oasis:names:tc:opendocument:xmlns:presentation:1.0"
>>> URI_SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
>>> URI_CHART = "urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
>>> URI_DR3D = "urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
>>> URI_MATH = "http://www.w3.org/1998/Math/MathML"
>>> URI_FORM = "urn:oasis:names:tc:opendocument:xmlns:form:1.0"
>>> URI_SCRIPT = "urn:oasis:names:tc:opendocument:xmlns:script:1.0"
>>> URI_OOO = "http://openoffice.org/2004/office"
>>> URI_OOOW = "http://openoffice.org/2004/writer"
>>> URI_OOOC = "http://openoffice.org/2004/calc"
>>> URI_DOM = "http://www.w3.org/2001/xml-events"
>>> URI_XFORMS = "http://www.w3.org/2002/xforms"
>>> URI_XSD = "http://www.w3.org/2001/XMLSchema"
>>> URI_XSI = "http://www.w3.org/2001/XMLSchema-instance"
>>> URI_FIELD = "urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:field:1.0"
>>> # prefix -> namespace URI map for OpenOffice documents
>>> NSMAP_OO = {
...     "office": URI_OFFICE,
...     "style": URI_STYLE,
...     "text": URI_TEXT,
...     "table": URI_TABLE,
...     "draw": URI_DRAW,
...     "fo": URI_FO,
...     "xlink": URI_XLINK,
...     "dc": URI_DC,
...     "meta": URI_META,
...     "number": URI_NUMBER,
...     "presentation": URI_PRESENTATION,
...     "svg": URI_SVG,
...     "chart": URI_CHART,
...     "dr3d": URI_DR3D,
...     "math": URI_MATH,
...     "form": URI_FORM,
...     "script": URI_SCRIPT,
...     "ooo": URI_OOO,
...     "ooow": URI_OOOW,
...     "oooc": URI_OOOC,
...     "dom": URI_DOM,
...     "xforms": URI_XFORMS,
...     "xsd": URI_XSD,
...     "xsi": URI_XSI,
...     "field": URI_FIELD,
... }
>>> from lxml import etree
>>> tree = etree.parse('t.xml')
>>> foo = tree.getroot()
>>> # tags are matched in the fully qualified {namespace-uri}localname form
>>> [child.tag for child in foo.iterdescendants(tag='{%s}span' % URI_TEXT)]
['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']

Have a look at
http://opendocumentfellowship.com/projects/odfpy
too.

--
luigi
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________