Hi Patrick,

On Tue, Sep 7, 2010 at 10:35 PM, Patrick McClory wrote:
> Hello,
>
> I'm working on a project which requires validation of xml documents against
> .xsd schemas. We both create xml documents from scratch, and create xml docs
> from char * buffers read from a socket. I've run into trouble validating the
> docs created from buffers, even when the buffers are generated from a
> document that already validated successfully.
>
> For example I have the following schema in a file called "example.xsd":
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!--
>   A simple test schema
> -->
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>            targetNamespace="http://localhost/test_namespace"
>            xmlns="http://localhost/test_namespace">
>
>   <xs:complexType name="testType">
>     <xs:sequence minOccurs="1" maxOccurs="1">
>       <xs:element name="element1" type="xs:string"/>
>       <xs:element name="element2" type="xs:int"/>
>     </xs:sequence>
>   </xs:complexType>
>
>   <xs:element name="testInstance" type="testType"/>
> </xs:schema>
>
> The following code creates an xml doc from scratch, which validates. It then
> dumps that doc into a buffer, reads that buffer into a new doc, and tries to
> validate that, but the second validation fails:
> (snip C code)
> The output generated when this runs is:
>
> generated doc is valid
> element element1: Schemas validity error : Element
> '{http://localhost/test_namespace}element1': This element is not expected.
> Expected is ( element1 ).
> doc from buffer is invalid
>
> I dump both docs to output files (generated.xml and buffer.xml), and I
> confirmed that they're the same on disk using diff.
>
> The problem seems to be that when libxml reads from the buffer it attaches
> the parent namespace to the children (if it isn't specified), which later
> causes validation to fail. Is this a common problem? Is there a standard
> workaround for this?
>
If you run "xmllint --schema example.xsd generated.xml", you'll get the same
error. Same with buffer.xml too.

In generated.xml, the default namespace of "http://localhost/test_namespace"
applies to all elements, including <element1> and <element2>. However, in the
xsd the targetNamespace applies only to the top-level
<element name="testInstance"> (I think): the schema does not set
elementFormDefault="qualified", so the locally declared element1 and element2
are unqualified, i.e. in no namespace.

So either the in-memory validation is incorrect or the writing and re-reading
of the XML document is not a null operation. xmlDocDumpFormatMemory may be at
fault.

To validate, element1 and element2 would each need an xmlns="". The following
validates with example.xsd:

  <testInstance xmlns="http://localhost/test_namespace">
    <element1 xmlns="">foo</element1>
    <element2 xmlns="">1</element2>
  </testInstance>

Running under the debugger:

  (gdb) p *root_node
  $1 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdca3e8 "testInstance",
    children = 0xdca468, last = 0xdca508, parent = 0xdca330, next = 0x0,
    prev = 0x0, doc = 0xdca330, ns = 0xdca400, ...}
  (gdb) p *root_node->children
  $2 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdca4a8 "element1",
    children = 0xdca4b8, last = 0xdca4b8, parent = 0xdca3a8, next = 0xdca508,
    prev = 0x0, doc = 0xdca330, ns = 0xdca448, ...}

Note the different ns for the root and the first child.

  (gdb) p *new_doc->children
  $5 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdcaa2b "testInstance",
    children = 0xdcaef0, last = 0xdcaf88, parent = 0xdcadf0, next = 0x0,
    prev = 0x0, doc = 0xdcadf0, ns = 0xdcaea8, ...}
  (gdb) p *new_doc->children->children
  $6 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdcaa58 "element1",
    children = 0xdcaf30, last = 0xdcaf30, parent = 0xdcc9b8, next = 0xdcaf88,
    prev = 0x0, doc = 0xdcadf0, ns = 0xdcaea8, ...}

This time, the ns member has the same value for the root element and its
child.

-- 
Life is complex, with real and imaginary parts.
"Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds
"People disagree with me.
I just ignore them." -- Linus Torvalds
