On Sun, Feb 28, 2010 at 6:03 PM, Bruce Dawson <j...@codemeta.com> wrote:
> Has anyone used xmlstarlet (a command-line xml parser) to get data from
> content.xml (OpenOffice) files?

I had not heard of it before, so thanks for pointing it out. (Note to
general readers: on my Ubuntu system I had to create a symbolic link 'xml'
to /usr/bin/xmlstarlet to use the 'xml' command referenced in the
documentation.)

> It seems to be complaining about Undefined namespace prefix, and I can't
> seem to figure out what it wants.

I was able to get results by specifying more/all of the namespaces used in
the document in question. For example:

    xml select -N :1.0' \
      -N table='urn:oasis:names:tc:opendocument:xmlns:table:1.0' \
      -N draw='urn:oasis:names:tc:opendocument:xmlns:drawing:1.0' \
      -N fo='urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0' \
      -N xlink='http://www.w3.org/1999/xlink' \
      -N dc='http://purl.org/dc/elements/1.1/' \
      -N meta='urn:oasis:names:tc:opendocument:xmlns:meta:1.0' \
      -N number='urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0' \
      -N svg='urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0' \
      -N chart='urn:oasis:names:tc:opendocument:xmlns:chart:1.0' \
      -N dr3d='urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0' \
      -N form='urn:oasis:names:tc:opendocument:xmlns:form:1.0' \
      -N script='urn:oasis:names:tc:opendocument:xmlns:script:1.0' \
      -N ooo='http://openoffice.org/2004/office' \
      -N ooow='http://openoffice.org/2004/writer' \
      -N oooc='http://openoffice.org/2004/calc' \
      -N dom='http://www.w3.org/2001/xml-events' \
      -N xforms='http://www.w3.org/2002/xforms' \
      -N xsd='http://www.w3.org/2001/XMLSchema' \
      -N xsi='http://www.w3.org/2001/XMLSchema-instance' \
      -N rpt='http://openoffice.org/2005/report' \
      -N of='urn:oasis:names:tc:opendocument:xmlns:of:1.2' \
      -N rdfa='http://docs.oasis-open.org/opendocument/meta/rdfa#' \
      -N field='urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:field:1.0' \
      -N formx='urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:form:1.0' \
      --text --template --value-of office:document-content content.xml

Or,

    xmlstarlet select -T \
      -N office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" \
      -N xlink="http://www.w3.org/1999/xlink" \
      -N dc="http://purl.org/dc/elements/1.1/" \
      -N meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" \
      -N ooo="http://openoffice.org/2004/office" \
      -t -v office:document-meta/office:meta/meta:generator meta.xml

I found it was easiest to identify all the namespaces used in the document
by using the "el" and "ed" commands, i.e.

    xml elements -v content.xml

or

    xml edit -v content.xml

The online reference in PDF was helpful as a cheatsheet:
http://xmlstar.sourceforge.net/doc/xmlstarlet.pdf

~ Greg
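
P.S. Typing all of those -N options by hand is error-prone, so here is a rough sketch of how they might be generated from the file itself. It assumes a grep that supports -o (GNU and BSD grep do), and that every namespace is declared as xmlns:prefix="uri" with double quotes, as in the files OpenOffice writes. The name ns_args is made up for illustration; it is not part of xmlstarlet.

```shell
#!/bin/sh
# ns_args FILE
# Hypothetical helper: print one "-N prefix=uri" line for each distinct
# xmlns:prefix="uri" declaration found anywhere in FILE.
ns_args() {
    # -o prints each match on its own line, even when the whole
    # document sits on a single line, as content.xml usually does.
    grep -o 'xmlns:[A-Za-z0-9_.-]*="[^"]*"' "$1" |
        sort -u |
        # Turn xmlns:dc="http://..." into -N dc=http://...
        sed 's/^xmlns:\([^=]*\)="\(.*\)"$/-N \1=\2/'
}
```

Since the URIs contain no whitespace, the output can probably be passed straight through an unquoted command substitution, e.g. xml select $(ns_args content.xml) --text --template --value-of office:document-content content.xml, though I have only sketched this and a default (unprefixed) xmlns declaration is deliberately ignored.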
_______________________________________________
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/