Per Bernhardsson wrote:
Thanks for your help, it has finally started to validate correctly. There
are still some minor problems though.

1) Content validation is performed after the call to the overridden
'characters' method. I assumed the validation would happen before you are
sent the content, so that you don't have to check the validity twice (once
in code and once using the schema). My current workaround is to parse twice:
once with an empty parser which just validates, and then with the real parser
which processes all data as well. This works for me as the amount of data
isn't large, but I want to have a real solution.

Validation is done in the endElement() callback, when the element has been completely closed. Keep in mind that characters() can be called multiple times with partial content, and you should not rely on it having the full value (this can occur when an entity reference is present, or if the string is longer than the internal buffer used by Xerces). Hence, your implementation of characters() should just collect the data, and process it in the endElement() callback.

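
The collect-then-process pattern described above can be sketched without any Xerces dependency. In the sketch below the class name CollectingHandler and the std::string-based callback signatures are illustrative assumptions; a real Xerces handler would derive from DefaultHandler and receive XMLCh buffers instead. It also assumes only leaf elements carry text.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a SAX content handler. It only shows the
// buffering pattern: characters() appends, endElement() consumes.
class CollectingHandler {
public:
    // A new element starts: discard any text collected so far.
    void startElement(const std::string& /*name*/) { buffer_.clear(); }

    // May be called several times per element with partial content,
    // so the chunk is appended rather than processed.
    void characters(const std::string& chunk) { buffer_ += chunk; }

    // The element is closed: the buffer now holds its complete text.
    void endElement(const std::string& name) {
        values_.emplace_back(name, buffer_);
        buffer_.clear();
    }

    const std::vector<std::pair<std::string, std::string>>& values() const {
        return values_;
    }

private:
    std::string buffer_;
    std::vector<std::pair<std::string, std::string>> values_;
};
```

With this shape, the application code in endElement() only ever sees the complete text of an element, no matter how the parser split it across characters() calls.
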
2) During validation the input data is reformatted for some reason,
linefeeds are replaced with spaces for example. Is there any way to stop
this behaviour?


If the reformatting occurs only when validating, it could be that the data type has the whiteSpace facet set to "replace" or "collapse" (instead of "preserve").
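
For reference, here is a small sketch of what the three whiteSpace values do to a string, following the XML Schema definition ("replace" turns tab, line feed and carriage return into spaces; "collapse" additionally merges runs of spaces and trims both ends). The enum and function names are made up for illustration and this is not how Xerces implements it. Among the built-in types, xs:string uses "preserve", xs:normalizedString uses "replace", and xs:token uses "collapse".

```cpp
#include <string>

// The three values of the XML Schema whiteSpace facet.
enum class WhiteSpace { Preserve, Replace, Collapse };

// Hypothetical helper: applies the facet to a lexical value.
std::string normalizeWhiteSpace(const std::string& in, WhiteSpace mode) {
    if (mode == WhiteSpace::Preserve) return in;

    // "replace": tab, line feed and carriage return each become a space.
    std::string replaced = in;
    for (char& c : replaced) {
        if (c == '\t' || c == '\n' || c == '\r') c = ' ';
    }
    if (mode == WhiteSpace::Replace) return replaced;

    // "collapse": after replacement, merge runs of spaces into one
    // and drop leading and trailing spaces.
    std::string out;
    for (char c : replaced) {
        if (c == ' ') {
            if (!out.empty() && out.back() != ' ') out += ' ';
        } else {
            out += c;
        }
    }
    if (!out.empty() && out.back() == ' ') out.pop_back();
    return out;
}
```

If line feeds must survive validation, the element's type would need to be one whose whiteSpace facet is "preserve".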

Alberto

 / Per

2008/2/26, Alberto Massari <[EMAIL PROTECTED]>:
Per Bernhardsson wrote:
I have finally managed to find the time to answer you. :)

This doesn't work for me for a very simple reason: the EntityResolver is
never called.

To explain a bit better:

I would like to validate the following XML stream:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <intelem>1</intelem>
    <stringelem>string</stringelem>
</root>

using the following schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
    <xs:element name="root">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="intelem" type="xs:int"/>
                <xs:element name="stringelem" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

As you can see there is no reference in the XML to the XSD, neither do I
want there to be a reference anywhere as the source of XML stream is
completely untrusted.

Is what I'm looking for even possible to do with Xerces?

Yes; preload the schema using loadGrammar(.., true) to have it cached,
set the "use cached grammars" flag, set validation to "always" and your
XML will be validated.


Alberto


 / Per

2008/2/25, Alberto Massari <[EMAIL PROTECTED]>:

Per Bernhardsson wrote:

I'm trying to parse and validate an XML stream using an in-memory copy of a
schema. As I can't trust the source of the XML stream, I need to force
validation against the correct schema. Creating a file containing the schema
is not an acceptable solution. Is it at all possible?



Register an entity resolver handler to intercept the "load schema"
request, and provide a MemBufInputSource with your buffer. See the
Redirect sample for an example on how to do it.


Alberto






