Dear all,
I hope this email finds you well!
I am trying to obtain multiple error messages when validating a test XML file
that I know in advance contains two or more different errors. However, when I
read lxml's error log as suggested in the docs, I get only one error (the one
that fails first). I'm a bit 'rusty' in Python (no pun intended), but the few
comments I have found around the Web (e.g. Stack Overflow), even though not
very explicit, do suggest this is possible. Am I doing anything wrong, or did I
miss something? (Chances are I did!)
Here is a code snippet (I am trying to do this in Databricks/Spark; the lxml
version is 5.2.2):

from lxml import etree

def validate_xml(xml_file, xsd_file):
    # Parse the XML and XSD files
    xml_doc = etree.parse(xml_file)
    with open(xsd_file, 'r') as f:
        xsd_doc = etree.XML(bytes(f.read(), 'utf-8'))
    # Create an XMLSchema object
    schema = etree.XMLSchema(xsd_doc)
    err_log = ""
    # Validate the XML document against the schema
    #is_valid = schema.validate(xml_doc)
    try:
        schema.assertValid(xml_doc)
    except etree.DocumentInvalid as err:
        err_log = str([error for error in schema.error_log])
        #err_log = str(err)
    return err_log or "Valid"

I was under the impression that error_log is not iterable (although the list
comprehension above does run), but still, I was hoping I would find all the
error messages by printing the entire log. What am I missing? Isn't this
possible? (Is this a SAX-type parser only?)
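
For reference, here is a minimal sketch of how every entry in the log could be read one by one, rather than printing the log as a single string. The helper name collect_schema_errors is made up for illustration, and it takes the XML and XSD as bytes instead of file paths. It only shows how to read whatever entries the log holds after a validation run; it does not by itself make the validator report more errors than it already does.

```python
from lxml import etree


def collect_schema_errors(xml_bytes, xsd_bytes):
    """Hypothetical helper: return one formatted string per error-log entry."""
    schema = etree.XMLSchema(etree.XML(xsd_bytes))
    doc = etree.XML(xml_bytes)
    # validate() returns a bool instead of raising DocumentInvalid;
    # the entries for this run are then available on schema.error_log.
    schema.validate(doc)
    return [
        f"line {entry.line}, col {entry.column}: {entry.message}"
        for entry in schema.error_log
    ]
```

Each log entry exposes attributes such as line, column and message, so the result is a plain list of strings (empty when the document is valid) that is easy to return from a Spark job or to join with newlines.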
Also, it looks like I cannot attach files here, so you can find the example
XML and XSD I used at the end of this message.
For the kind souls out there: any hint, suggestion or example would be much
appreciated!
Thank you so much in advance!
Claudio P.
-------------------------------------------------
--- xml & xsd follows:
---Schema---
<?xml version="1.0" encoding="utf-8" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="BusinessCard">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="phone" maxOccurs="unbounded">
          <xsd:complexType mixed="true">
            <xsd:attribute name="type" use="required">
              <xsd:simpleType>
                <xsd:restriction base="xsd:string">
                  <xsd:enumeration value="mobile"/>
                  <xsd:enumeration value="fax"/>
                  <xsd:enumeration value="work"/>
                  <xsd:enumeration value="home"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:attribute>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="email" type="xsd:string" minOccurs="0" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
----------------------------------------------------------
---xml---
<?xml version="1.0"?>
<BusinessCard>
  <phone type="mobil">(415) 555-4567</phone>
  <phone type="work">(800) 555-9876</phone>
  <phone type="fax">(510) 555-1234</phone>
  <mail>[email protected]</mail>
</BusinessCard>