Jaya, it seems you have done a tremendous amount of work. Where will the code be available? I'd like to take a look, as it seems really interesting. Can you please point me to it?
Thanks,
Chinthaka

> -----Original Message-----
> From: jayachandra [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 25, 2005 7:26 PM
> To: [email protected]
> Subject: [Axis2] [Update] XMLConformace Testing Report.
>
> Hi all,
>
> Total file count in W3C XMLSuite: 2634 (this includes valid, invalid
> and ill-formed XMLs too)
> Of them, valid ones: 960 (i.e. excluding invalid and ill-formed XMLs;
> however, this includes XMLs of both versions 1.0 and 1.1)
> Of them, valid XML 1.0 ones: 832 (i.e. excluding XMLs from the 1.1
> version folders, since the MXParser we have beneath is only 1.0
> compliant)
>
> On this final set, when OM is tested as is, 335 files got parsed
> properly, and 309 files had the serialized XML matching the input file
> (comparison test). I've implemented OMComment and OMPI and added
> minimalistic OMDTD support (without validation etc.). With those
> changes the number of files parsed rose to 735 and comparison
> successes reached 567.
>
> The parsing failures found can be attributed to one or more of the
> following observations. This is not an exhaustive list, though.
>
> 1. For files where the XML declaration mentions the 'standalone'
> attribute before the 'encoding' attribute, the underlying MXParser
> threw an exception with a message reading something like "Expected 'e'
> in encoding and not 's'". Alek, is this a known issue with StAX? What
> do you think?
> 2. For files in which the DTD declaration has a right square bracket
> (']') as a literal value of some entity, MXParser treats it as the end
> of the DTD declaration.
> 3. Some XMLs containing multi-byte characters (the UK pound sign among
> others) fail to parse, with typical exception messages like "only
> whitespace content allowed before start tag and not \ufffd". I have
> passed a "UTF-8" aware reader to the builder; do I need to use
> something else here?
> 4. Apart from these, because I couldn't implement the complete DTD
> infoset, some more files fail to parse.
>
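On point 3 above, one possible cause (not verified against the suite files) is that some of the inputs are not UTF-8 at all. When a UTF-16 document is decoded by a reader fixed to UTF-8, its byte order mark turns into U+FFFD before the parser ever sees the XML declaration. The sketch below uses the JDK's bundled StAX implementation rather than MXParser; the class name, method names and sample document are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class EncodingSketch {

    // Hypothetical sample: a UTF-16 document containing a pound sign.
    static byte[] sampleUtf16() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><price>\u00a35</price>";
        // Java's UTF_16 charset writes a byte order mark (0xFE 0xFF) first.
        return xml.getBytes(StandardCharsets.UTF_16);
    }

    // Decode the bytes with a reader fixed to UTF-8 and return the first char.
    // The byte order mark is not valid UTF-8, so the decoder substitutes U+FFFD.
    static char firstCharAsUtf8(byte[] bytes) throws Exception {
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            return (char) r.read();
        }
    }

    // Hand the raw bytes to the parser so it can detect the encoding itself,
    // then return "<root element name>=<its text content>".
    static String parseFromStream(byte[] bytes) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(bytes));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName() + "=" + reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = sampleUtf16();
        System.out.printf("first char via UTF-8 reader: U+%04X%n",
                (int) firstCharAsUtf8(bytes));
        System.out.println("parsed from byte stream: " + parseFromStream(bytes));
    }
}
```

If the builder can accept an InputStream (or a reader created from one by the parser factory), letting the parser pick the encoding from the byte order mark and the XML declaration may avoid this class of failure. Whether that applies to MXParser in this setup is an open question.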
> Regarding the comparison, some of the observed reasons for failures
> are:
>
> 1. Many SYSTEM identifiers in DTD declarations use a relative
> reference, and so far we haven't considered a 'baseURI' property (does
> the StAX parser provide one?) for any of the elements. Hence the XML
> comparator (xmlunit) couldn't resolve the system identifiers, leading
> to a mismatch between the serialized XML and the original input form.
> 2. Also, since the DTD support is naive, the presentation of data is
> completely ignored, leading to scenarios like serializing as #PCDATA
> when the DTD says CDATA. This also led to significant comparison
> failures.
>
> Thanks
> Jaya
> ----
> Jaya
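
On the 'baseURI' question in comparison point 1: resolving a relative SYSTEM identifier needs only the URI of the document that declares it. In StAX, `XMLInputFactory.createXMLStreamReader(String systemId, InputStream stream)` lets the caller supply that URI, and `XMLStreamReader.getLocation().getSystemId()` reports it back; how MXParser populates it is not something I have checked. A minimal sketch of the resolution step itself, using only `java.net.URI` (the class name, method name and URIs are hypothetical):

```java
import java.net.URI;

public class SystemIdSketch {

    // Resolve a possibly relative SYSTEM identifier against the URI of the
    // document that declares it. Absolute identifiers are returned unchanged.
    static String resolveSystemId(String documentUri, String systemId) {
        return URI.create(documentUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        String doc = "http://example.org/suite/valid/sa/doc.xml";
        // prints http://example.org/suite/valid/sa/doc.dtd
        System.out.println(resolveSystemId(doc, "doc.dtd"));
        // prints http://example.org/suite/valid/shared/common.dtd
        System.out.println(resolveSystemId(doc, "../shared/common.dtd"));
    }
}
```

Storing the resolved identifier (or the base URI next to the raw one) on the DTD node would give the comparator something it can dereference, at the cost of the serialized form no longer matching the input byte for byte.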
