Hi Daniel, Alexey,

Alexey Neyman <sti...@att.net> writes:

> I think I know what is causing the issue. The code in
> xmlXIncludeLoadDoc looks at the url argument to see if it is relative
> path - to do so, it looks for slashes in the path. The problem is that
> xmlXIncludeLoadNode() passes down URIs that are relative to the top-
> level document, not to the most recent inclusion. Therefore, in the
> example below the url in xmlXIncludeLoadDoc() is just '3.xml', not
> '../3.xml' - and thus, the code wrongly considers it to be based in
> the same directory as the current included file.

Thanks for fixing this.

Maybe this whole "check for a slash to tell if xml:base fixup is
needed" logic is flawed, though?

I'm using libxml2 2.9.1 and lxml 3.2.1

Given these example files (similar to your examples, Alexey), I get no
xml:base fixup at all:

### sample files ##################################################

# generate three example files
mkdir test
cd test

cat >1.xml <<EOF
<?xml version="1.0"?>
<top xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="2.xml"/>
</top>
EOF

cat >2.xml <<EOF
<?xml version="1.0"?>
<elem1 xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="3.xml"/>
</elem1>
EOF

cat >3.xml <<EOF
<?xml version="1.0"?>
<elem2>
  <a fileref="x.svg"/>
</elem2>
EOF

### wrong output ##################################################

# expect xml:base fixup. Get none :(
xmllint --xinclude 1.xml
<?xml version="1.0"?>
<top xmlns:xi="http://www.w3.org/2001/XInclude">
  <elem1 xmlns:xi="http://www.w3.org/2001/XInclude">
    <elem2>
      <a fileref="x.svg"/>
    </elem2>
  </elem1>
</top>
###################################################################

The xml:base is not just the directory, it also contains the file
name, right? The whole XInclude test suite behaves like that, see
below. So it _should_ look like this, shouldn't it?

This is what I get with the attached patch to libxml:

### correct output ################################################

xmllint --xinclude 1.xml
<?xml version="1.0"?>
<top xmlns:xi="http://www.w3.org/2001/XInclude">
  <elem1 xmlns:xi="http://www.w3.org/2001/XInclude" xml:base="2.xml">
    <elem2 xml:base="3.xml">
      <a fileref="x.svg"/>
    </elem2>
  </elem1>
</top>
###################################################################

The XInclude test suite agrees when run with the attached script, like
this:

###################################################################
cvs -d:pserver:anonym...@dev.w3.org:/sources/public \
    co 2001/XInclude-Test-Suite XInclude-Test-Suite
cd XInclude-Test-Suite
python3 PATH-TO/run-tests-with-lxml.py
###################################################################

This gives about 15 fewer failures when run with the patch below, and
as far as I can tell from comparing the results with and without the
patch, there are no additional ones. So it should be an improvement :)

S.

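To compare two runs of the attached script (with and without the patch), the result lines can be tallied by their prefix. The following is only a sketch: the helper name `tally` is hypothetical, and it assumes the script's output keeps the "pass:", "fail:", "untested:" and "### diff:" line prefixes it prints today.

```python
from collections import Counter


def tally(lines):
    """Count result lines by the prefix run-tests-with-lxml.py prints.

    Hypothetical helper, not part of the attached script.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("pass: "):
            counts["pass"] += 1
        elif line.startswith("fail: "):
            counts["fail"] += 1
        elif line.startswith("untested: "):
            counts["untested"] += 1
        elif line.startswith("### diff: "):
            # Header of a unified diff, i.e. output differs from expected.
            counts["diff"] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python3 PATH-TO/run-tests-with-lxml.py | python3 tally.py
    for kind, n in sorted(tally(sys.stdin).items()):
        print("{}: {}".format(kind, n))
```

Running this once against an unpatched and once against a patched libxml2 gives two sets of counts to compare; it does not say which individual test cases changed, so a review of the actual diffs is still needed.
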
Do xml:base fixup for file name changes in the same directory, too.

The "if it contains no slash, it needs no fixup" logic breaks the
XInclude test suite.

Index: libxml2-2.9.1/xinclude.c
===================================================================
--- libxml2-2.9.1.orig/xinclude.c
+++ libxml2-2.9.1/xinclude.c
@@ -1685,7 +1685,7 @@ loaded:
 #endif
 
     /*
-     * Do the xml:base fixup if needed
+     * Do the xml:base fixup as needed
      */
     if ((doc != NULL) && (URL != NULL) && (xmlStrchr(URL, (xmlChar) '/')) &&
         (!(ctxt->parseFlags & XML_PARSE_NOBASEFIX)) &&
@@ -1695,28 +1695,26 @@ loaded:
         xmlChar *curBase;
 
         /*
-         * The base is only adjusted if "necessary", i.e. if the xinclude node
-         * has a base specified, or the URL is relative
+         * The xml:base is adjusted as necessary.  Possibly the
+         * xinclude node has a base specified?
          */
         base = xmlGetNsProp(ctxt->incTab[nr]->ref, BAD_CAST "base",
                         XML_XML_NAMESPACE);
         if (base == NULL) {
             /*
-             * No xml:base on the xinclude node, so we check whether the
-             * URI base is different than (relative to) the context base
+             * No xml:base on the xinclude node.  Compute the base
+             * from the URL of the included document, if possible
+             * relative to the context base.  See
+             * uri.c:xmlBuildRelativeURI for the relative/absolute
+             * magic.
              */
             curBase = xmlBuildRelativeURI(URL, ctxt->base);
             if (curBase == NULL) {      /* Error return */
                 xmlXIncludeErr(ctxt, ctxt->incTab[nr]->ref,
                        XML_XINCLUDE_HREF_URI,
                        "trying to build relative URI from %s\n", URL);
-            } else {
-                /* If the URI doesn't contain a slash, it's not relative */
-                if (!xmlStrchr(curBase, (xmlChar) '/'))
-                    xmlFree(curBase);
-                else
-                    base = curBase;
             }
+            base = curBase;
         }
         if (base != NULL) {     /* Adjustment may be needed */
             node = ctxt->incTab[nr]->inc;

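As a rough illustration of what the second hunk changes, here is a minimal Python sketch of the base selection before and after the patch. It is not libxml2 code: `relative_uri` is only a stand-in for xmlBuildRelativeURI that treats URIs as plain POSIX paths, the function names are made up for this example, and the outer xmlStrchr(URL, '/') test (which the patch leaves in place) is not modelled.

```python
import posixpath


def relative_uri(url, ctxt_base):
    """Stand-in for xmlBuildRelativeURI (assumption: plain relative paths).

    Returns url expressed relative to the directory of ctxt_base.
    """
    start = posixpath.dirname(ctxt_base) or "."
    return posixpath.relpath(url, start)


def base_before_patch(url, ctxt_base):
    """Old logic: a relative URI without a slash is dropped, so no fixup."""
    rel = relative_uri(url, ctxt_base)
    return rel if "/" in rel else None


def base_after_patch(url, ctxt_base):
    """Patched logic: the relative URI is always used as the xml:base."""
    return relative_uri(url, ctxt_base)


if __name__ == "__main__":
    for url, ctxt_base in [("2.xml", "1.xml"), ("3.xml", "sub/2.xml")]:
        print(url, ctxt_base,
              base_before_patch(url, ctxt_base),
              base_after_patch(url, ctxt_base))
```

With the sample files above, 2.xml sits next to 1.xml, so the relative URI has no slash and the old logic yields no xml:base, while the patched logic yields "2.xml". Only when the included file lives in another directory do both variants agree.
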
#!/usr/bin/env python3
# (C) 2014 Susanne Oberhauser-Hirschoff <f...@suse.com>
# The MIT license applies http://opensource.org/licenses/MIT
"""
# Run the XInclude test suite through lxml:

# get the test suite
cvs -d:pserver:anonym...@dev.w3.org:/sources/public \
    co 2001/XInclude-Test-Suite XInclude-Test-Suite
cd XInclude-Test-Suite

# run this script
python3 PATH-TO/run-tests-with-lxml.py
"""

import difflib

from lxml import etree, objectify

tests = objectify.parse('testdescr.xml').getroot()

# Test-suite feature -> xmllint option; None marks a feature that this
# script cannot exercise.
feature2xmllint_option = {
    'xpointer-scheme': '',
    'unexpanded-entities': None,
    'unparsed-entities': None,
    'lang-fixup': None,
}


class TC:
    pass


tcs = list()

# Collect all test cases from the test suite description.
for suite in tests.testcases:
    basedir = suite.get('basedir')
    creator = suite.get('creator')
    for case in suite.testcase:
        tc = TC()
        tc.basedir = basedir
        tc.creator = creator
        tc.id = case.get('id')
        tc.file = case.get('href')
        # success, error or optional
        tc.type = case.get('type')
        if tc.type == 'error':
            tc.result_file = None
        else:
            tc.result_file = case.output
        required_features = case.get('features')
        if required_features is None:
            tc.required_features = list()
        else:
            tc.required_features = required_features.split()
        tcs.append(tc)

# Work out which test cases need features that are not handled here.
for tc in tcs:
    if tc.required_features is None:
        tc.xmllint_options = ['']
    else:
        tc.xmllint_options = tuple(feature2xmllint_option[f]
                                   for f in tc.required_features)
    if None in tc.xmllint_options:
        tc.unhandled_features = tuple(
            filter(
                lambda x: None is feature2xmllint_option[x],
                tc.required_features
            ))
    else:
        tc.unhandled_features = None


def xinclude_expand(tc):
    filename = "{tc.basedir}/{tc.file}".format(tc=tc)
    got = etree.parse(filename)
    got.xinclude()
    result = ['<?xml version="1.0"?>']
    result.extend(
        etree.tostring(got, encoding=str).splitlines())
    return filename, result


# Expand each test case and diff the result against the expected output.
for tc in tcs:
    if tc.unhandled_features is not None:
        print("untested: {tc.creator}-{tc.id}: can't handle options {tc.unhandled_features}\n".format(tc=tc))
        continue
    try:
        tofile, got = xinclude_expand(tc)
        fromfile = "{tc.basedir}/{tc.result_file}".format(tc=tc)
        with open(fromfile) as f:
            expected = f.read().splitlines()
        diff = difflib.unified_diff(expected, got,
                                    fromfile=fromfile,
                                    tofile="lxml.etree.parse( {} ).xinclude().tostring()".format(tofile),
                                    lineterm='')
        diff = list(diff)
        if len(diff) == 0:
            print("pass: {tc.creator}-{tc.id}".format(tc=tc))
        else:
            print("###{:#<64}".format(" diff: {tc.creator}-{tc.id} ".format(tc=tc)))
            for line in diff:
                print(line)
            print('###################################################################')
    except Exception as e:
        if tc.type == 'error':
            print("pass: {tc.creator}-{tc.id}: expected error {e}".format(tc=tc, e=e))
        else:
            print("fail: {tc.creator}-{tc.id}: unexpected error {e}".format(tc=tc, e=e))

-- 
Susanne Oberhauser                     SUSE LINUX Products GmbH
+49-911-74053-574                      Maxfeldstraße 5
Processes and Infrastructure           90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
_______________________________________________ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml