Jaya, it seems you have done a tremendous amount of work.

Where is the code available? I'd like to take a look, as it seems
really interesting. Can you please point me to it?

Thanks,
Chinthaka

> -----Original Message-----
> From: jayachandra [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 25, 2005 7:26 PM
> To: [email protected]
> Subject: [Axis2] [Update] XMLConformace Testing Report.
> 
> Hi all,
> Total file count in W3C XMLSuite: 2634 (this includes valid, invalid and
> ill-formed XMLs too)
> Of them, valid ones: 960 (i.e. excluding invalid and ill-formed XMLs.
> However, this includes XMLs of both versions 1.0 and 1.1)
> Of them, valid XML 1.0 ones: 832 (i.e. excluding XMLs from the 1.1
> version folders, since the MXParser we have beneath is only 1.0
> compliant)
> On this final set, when OM is tested as is, 335 files got parsed
> properly, and 309 files had the serialized XML matching the input file
> (comparison test). I've implemented OMComment and OMPI and added
> minimal OMDTD support (without validation etc.). With those changes the
> parsing rate increased to 735 and comparison successes reached 567.
> The parsing failures found can be attributed to one or more of the
> following observations I could make. This is not an exhaustive list,
> though.
> 1. For files where the XML declaration line mentions the 'standalone'
> attribute prior to the 'encoding' attribute, the underlying MXParser
> threw an exception with a message reading something like "Expected 'e'
> in encoding and not 's'". Alek! Is this a known issue with STAX? What
> do you think?
> 2. For files in which the DTD declaration has a right square bracket
> (']') as a literal value of some entity, MXParser treats it as the end
> of the DTD declaration.
> 3. Some XMLs having multi-byte characters (the UK currency pound sign
> amongst others) fail to get parsed, with typical exception messages
> like "only whitespace content allowed before start tag and not \ufffd".
> I have passed a "UTF-8" aware reader to the builder; do I need to use
> something else here?
> 4. Apart from these, because I couldn't implement the complete DTD
> infoset, some more files fail to get parsed.
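
On point 3, one possible cause (an assumption, not something confirmed in
this thread) is that a file declaring a non-UTF-8 encoding is being pushed
through the UTF-8 reader: bytes that are not valid UTF-8 are then decoded as
U+FFFD. The sketch below illustrates that effect and the alternative of
handing the parser the raw byte stream so it can read the encoding from the
XML declaration. It uses the StAX implementation bundled with the JDK, not
MXParser, and the names EncodingSketch, decodeAsUtf8 and textOfRoot are made
up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class EncodingSketch {

    // Decode raw bytes as UTF-8, the way a hard-wired UTF-8 Reader would.
    // Malformed byte sequences are replaced with U+FFFD.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Parse from the byte stream so the parser (here: the JDK's StAX
    // implementation, an assumption; MXParser may behave differently)
    // can pick the encoding up from the XML declaration itself.
    static String textOfRoot(byte[] raw) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(raw));
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                text.append(reader.getText());
            }
        }
        reader.close();
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // A small document containing a pound sign, stored as ISO-8859-1.
        byte[] latin1 =
                "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><p>\u00a3</p>"
                        .getBytes(StandardCharsets.ISO_8859_1);

        // Forcing UTF-8 on these bytes corrupts the pound sign.
        System.out.println(decodeAsUtf8(latin1).contains("\uFFFD"));

        // Letting the parser detect the encoding preserves it.
        System.out.println(textOfRoot(latin1));
    }
}
```

If the failing files turn out to be UTF-8 after all, this is not the
explanation, and something else (a byte order mark, for instance) would be
worth checking.
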
> Regarding the comparison, some of the observed reasons for failures are:
> 1. Many SYSTEM identifiers in DTD declarations used a relative
> reference, and so far we haven't considered a 'baseURI' property (does
> the STAX parser provide one?) for any of the elements. Hence the XML
> comparator (xmlunit) couldn't resolve the system identifiers, leading
> to a mismatch between the serialized XML and the original input form.
> 2. Also, since the DTD support is naive, the presentation of data is
> completely ignored, leading to scenarios like serializing as #PCDATA
> when the DTD says CDATA. This also led to significant comparison
> failures.
> Thanks,
> Jaya
> ---- Jaya
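
On the relative SYSTEM identifiers: as a rough illustration of what a
'baseURI' property would buy us, a relative identifier can be resolved
against the URI of the document that declares it with plain java.net.URI.
This is only a sketch; the class name BaseUriSketch, the method
resolveSystemId and the file paths are hypothetical, and it says nothing
about what OM or xmlunit actually do today.

```java
import java.net.URI;

class BaseUriSketch {

    // Resolve a SYSTEM identifier against the URI of the document that
    // declares it. Absolute identifiers are returned unchanged.
    static URI resolveSystemId(URI documentUri, String systemId) {
        return documentUri.resolve(systemId);
    }

    public static void main(String[] args) {
        // Hypothetical location of a test file inside the suite.
        URI doc = URI.create("file:/suite/valid/sa/doc.xml");

        // A sibling DTD referenced by a relative system identifier.
        System.out.println(resolveSystemId(doc, "doc.dtd"));

        // A reference that climbs out of the document's folder.
        System.out.println(resolveSystemId(doc, "../ext/doc.ent"));
    }
}
```

With the base URI recorded at parse time, both the serializer and the
comparator could be handed absolute identifiers instead of relative ones.
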


