Package: python3-lxml
Version: 4.6.3-1
Severity: important

Dear Maintainer,

The Relax NG validator does not correctly report the errors in an XML file.
A minimal set of files reproducing the problem is attached in the archive
"bug_demo.tar.gz".
The agwb.rng file defines the XML schema. The example1.xml file is a
conforming XML document.
The example1a.xml file has an error: field "B2" in creg "X2" lacks the
required "width" attribute.
The example1b.xml file has another error: the attribute "name" in block
SYS1 is misspelled as "nafme".

Validating the erroneous files, however, gives incorrect results.
The tests were performed both with the Python lxml module and with the
xmllint utility.
The errors are similar in both cases, so the problem is probably located
in the lxml library.
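
The test0.py script itself is only in the attached archive. A minimal
sketch of an equivalent check is given below, assuming the script uses
lxml's etree.RelaxNG class and prints the validator's error_log. The
function name validate_relaxng is made up for illustration.

```python
from lxml import etree


def validate_relaxng(schema_source, xml_source):
    """Validate a document against a Relax NG schema.

    Both arguments may be file names or file-like objects.
    Returns a tuple (is_valid, error_lines).
    """
    schema = etree.RelaxNG(etree.parse(schema_source))
    doc = etree.parse(xml_source)
    is_valid = schema.validate(doc)
    # error_log holds the messages emitted during the last validate() call,
    # formatted as file:line:column:level:domain:type: message
    error_lines = [str(entry) for entry in schema.error_log]
    return is_valid, error_lines


# Hypothetical usage with the files from the attached archive:
#   ok, errors = validate_relaxng("agwb.rng", "example1a.xml")
#   print(ok)
#   print("\n".join(errors))
```

The messages come from the underlying validator, so a script of this
kind can only pass on whatever detail the validator provides.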

$ ./test0.sh example1a.xml
[...]
example1a.xml:11: element field: Relax-NG validity error : Element field failed to validate attributes
example1a.xml:10: element field: Relax-NG validity error : Element creg has extra content: field
Relax-NG validity error : Extra element creg in interleave
example1a.xml:9: element creg: Relax-NG validity error : Element block failed to validate content
Relax-NG validity error : Extra element block in interleave
example1a.xml:3: element block: Relax-NG validity error : Element sysdef failed to validate content
example1a.xml fails to validate

$ ./test0.py example1a.xml
False
example1a.xml:11:0:ERROR:RELAXNGV:RELAXNG_ERR_ATTRVALID: Element field failed to validate attributes
example1a.xml:10:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element creg has extra content: field
<string>:0:0:ERROR:RELAXNGV:RELAXNG_ERR_INTEREXTRA: Extra element creg in interleave
example1a.xml:9:0:ERROR:RELAXNGV:RELAXNG_ERR_CONTENTVALID: Element block failed to validate content
<string>:0:0:ERROR:RELAXNGV:RELAXNG_ERR_INTEREXTRA: Extra element block in interleave
example1a.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_CONTENTVALID: Element sysdef failed to validate content

The error messages lack specificity. They should give detailed
information about the missing attribute.
In the case of example1b.xml the situation is even worse:

$ ./test0.sh example1b.xml
example1b.xml:3: element block: Relax-NG validity error : Expecting an element sreg, got nothing
Relax-NG validity error : Extra element block in interleave
example1b.xml:3: element block: Relax-NG validity error : Element sysdef failed to validate content
example1b.xml fails to validate

$ ./test0.py example1b.xml
False
example1b.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_NOELEM: Expecting an element sreg, got nothing
<string>:0:0:ERROR:RELAXNGV:RELAXNG_ERR_INTEREXTRA: Extra element block in interleave
example1b.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_CONTENTVALID: Element sysdef failed to validate content

There is absolutely no reason to expect an "sreg" element in a "sysdef".
That would be against the defined schema.

However, if I convert the schema to a DTD:

$ trang -I rng -O dtd agwb.rng agwb.dtd

then validation against the DTD gives the correct results for both
erroneous files:

$ ./test1.sh example1a.xml
[...]
example1a.xml:11: element field: validity error : Element field does not carry attribute width
Document example1a.xml does not validate against agwb.dtd

$ ./test1.py example1a.xml
False
example1a.xml:11:0:ERROR:VALID:DTD_MISSING_ATTRIBUTE: Element field does not carry attribute width

$ ./test1.sh example1b.xml
[...]
example1b.xml:3: element block: validity error : Element block does not carry attribute name
example1b.xml:3: element block: validity error : No declaration for attribute nafme of element block
Document example1b.xml does not validate against agwb.dtd

$ ./test1.py example1b.xml
False
example1b.xml:3:0:ERROR:VALID:DTD_MISSING_ATTRIBUTE: Element block does not carry attribute name
example1b.xml:3:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE: No declaration for attribute nafme of element block

The above reports describe the errors in the validated files precisely.
Therefore the problem is most likely not in the Relax NG schema
definition but in the validator itself.
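
The test1.py script is likewise only in the attached archive. A minimal
sketch of the DTD check is given below, assuming the script uses lxml's
etree.DTD class and prints its error_log. The function name validate_dtd
is made up for illustration.

```python
from lxml import etree


def validate_dtd(dtd_source, xml_source):
    """Validate a document against a DTD.

    Both arguments may be file names or file-like objects.
    Returns a tuple (is_valid, error_lines).
    """
    dtd = etree.DTD(dtd_source)
    doc = etree.parse(xml_source)
    is_valid = dtd.validate(doc)
    # One entry per problem found, e.g. a DTD_MISSING_ATTRIBUTE error
    error_lines = [str(entry) for entry in dtd.error_log]
    return is_valid, error_lines


# Hypothetical usage with the DTD generated by trang:
#   ok, errors = validate_dtd("agwb.dtd", "example1b.xml")
#   print(ok)
#   print("\n".join(errors))
```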

-- System Information:
Debian Release: bullseye/sid
APT prefers testing
APT policy: (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-5-amd64 (SMP w/12 CPU threads)
Locale: LANG=pl_PL.UTF-8, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages python3-lxml depends on:
ii libc6 2.31-10
ii libxml2 2.9.10+dfsg-6.3+b1
ii libxslt1.1 1.1.34-4
ii python3 3.9.2-2

Versions of packages python3-lxml recommends:
ii python3-bs4 4.9.3-1
ii python3-html5lib 1.1-3

Versions of packages python3-lxml suggests:
pn python-lxml-doc <none>
pn python3-lxml-dbg <none>

-- no debconf information
