Hello Olaf,
First, you probably want to post questions like this to xerces-j-user;
you're much more likely to get an answer there. (Besides, this list is
intended for developers of Xerces itself, requests for enhancement, etc.,
rather than for folks who develop with the product.)
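At any rate: your servlet environment is playing tricks on you. Your
standalone code is using Xerces2 (the org.apache.xerces.impl and
org.apache.xerces.xni classes in your first stack trace), whereas your
servlet code is using Xerces1 (the org.apache.xerces.framework,
.validators and .readers classes in the second). So it looks like your
servlet environment comes with an older version of Xerces, and you're
picking that up in your testing.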
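One way to check this is to print the class of the reader you actually get
and the location it was loaded from, once standalone and once inside the
servlet. The following is only a minimal sketch: the class name WhichParser
and the helper describe() are made up for illustration, and it obtains a
reader through JAXP's SAXParserFactory purely so that it runs on its own.
In your code you would pass your own XMLReader instance to describe().

```java
import java.security.CodeSource;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class WhichParser {
    /** Returns the object's class name and the location it was loaded from. */
    public static String describe(Object o) {
        Class<?> c = o.getClass();
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader typically have no code source.
        String where = (src == null || src.getLocation() == null)
                ? "bootstrap/JDK (no code source)"
                : src.getLocation().toString();
        return c.getName() + " loaded from " + where;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: any XMLReader will do for the demonstration.
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(describe(reader));
    }
}
```

If the two environments report different jar locations for the same class
name, that would confirm that two different Xerces versions are in play.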
Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone: 905-413-3519, T/L 969-3519
E-mail: [EMAIL PROTECTED]
"Olaf Kittelmann" <[EMAIL PROTECTED]> on 01/07/2002 12:09:40 PM
Please respond to [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc:
Subject: ambigous parsing behaviour/whitespace prob
Hi everybody,
I have a problem with how Xerces handles whitespace when parsing my own
markup language.
I am trying to read the initialization data for a complex object structure
from XML, parsed with Xerces.
I have a class "StructureParser" that uses an XMLReader with an inner class
as its ContentHandler.
In a static initializer I specify org.apache.xerces.parsers.SAXParser as my
SAX driver, set up my debugging, and set validation to false.
static {
    // Name Xerces as the SAX driver class.
    System.setProperty("org.xml.sax.driver",
                       "org.apache.xerces.parsers.SAXParser");
    System.setProperty("debug", "false");
    // Prefer the "DEBUG" property; fall back to "debug".
    String strDebug = System.getProperty("DEBUG");
    if (strDebug == null)
        strDebug = System.getProperty("debug");
    debug = strDebug != null && strDebug.equalsIgnoreCase("true");
}
I wrote a main method for testing that takes the path to my XML file as an
argument, passes it to my XMLReader, and parses it.
Everything works fine: characters() is called when there is character data,
and ignorableWhitespace() is called when there is only whitespace.
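To make the two callbacks concrete, here is a minimal sketch of a handler
that only counts them. The class names, the inline DTD, and the use of the
JDK's JAXP parser are assumptions for illustration (the real Struktur.dtd
is not shown in this message). Whether whitespace arrives through
ignorableWhitespace() depends on the parser knowing the content model from
a DTD; SAX allows, but does not require, non-validating parsers to use it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceCallbacks {
    /** Counts how often each of the two text callbacks fires. */
    public static class CountingHandler extends DefaultHandler {
        public int characterCalls = 0;
        public int ignorableCalls = 0;

        @Override
        public void characters(char[] ch, int start, int length) {
            characterCalls++;
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            ignorableCalls++;
        }
    }

    /** Parses the given XML text without validation and returns the counts. */
    public static CountingHandler parse(String xml) {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature("http://xml.org/sax/features/validation", false);
            CountingHandler handler = new CountingHandler();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical DTD: only element content models, no attributes.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE Struktur [\n"
                + "  <!ELEMENT Struktur (Kategorie*)>\n"
                + "  <!ELEMENT Kategorie (LinkObjekt)>\n"
                + "  <!ELEMENT LinkObjekt (#PCDATA)>\n"
                + "]>\n"
                + "<Struktur>\n  <Kategorie>\n"
                + "    <LinkObjekt>Services</LinkObjekt>\n"
                + "  </Kategorie>\n</Struktur>\n";
        CountingHandler h = parse(xml);
        System.out.println("characters: " + h.characterCalls
                + ", ignorableWhitespace: " + h.ignorableCalls);
    }
}
```

Running the same counting handler in both environments would show directly
which callback each parser uses for the whitespace between elements.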
The call stack in the debugger looks like this:
de.elmedia.StructureParser$AbmlHandler.ignorableWhitespace(char[], int, int) line: 415
org.apache.xerces.parsers.SAXParser(org.apache.xerces.parsers.AbstractSAXParser).ignorableWhitespace(org.apache.xerces.xni.XMLString) line: 404
org.apache.xerces.impl.xs.XMLSchemaValidator.ignorableWhitespace(org.apache.xerces.xni.XMLString) line: 479
org.apache.xerces.impl.XMLNamespaceBinder.ignorableWhitespace(org.apache.xerces.xni.XMLString) line: 612
org.apache.xerces.impl.dtd.XMLDTDValidator.characters(org.apache.xerces.xni.XMLString) line: 836
org.apache.xerces.impl.XMLDocumentScannerImpl(org.apache.xerces.impl.XMLDocumentFragmentScannerImpl).scanContent() line: 836
org.apache.xerces.impl.XMLDocumentScannerImpl$ContentDispatcher(org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher).dispatch(boolean) line: 1379
org.apache.xerces.impl.XMLDocumentScannerImpl(org.apache.xerces.impl.XMLDocumentFragmentScannerImpl).scanDocument(boolean) line: 328
org.apache.xerces.parsers.DTDXSParserConfiguration(org.apache.xerces.parsers.StandardParserConfiguration).parse(boolean) line: 479
org.apache.xerces.parsers.DTDXSParserConfiguration(org.apache.xerces.parsers.StandardParserConfiguration).parse(org.apache.xerces.xni.parser.XMLInputSource) line: 521
org.apache.xerces.parsers.SAXParser(org.apache.xerces.parsers.XMLParser).parse(org.apache.xerces.xni.parser.XMLInputSource) line: 148
org.apache.xerces.parsers.SAXParser(org.apache.xerces.parsers.AbstractSAXParser).parse(org.xml.sax.InputSource) line: 972
For my real purpose I am using a servlet that does pretty much the same
thing: it creates a StructureParser, the static initializer is executed,
and it passes the same XML document to the StructureParser, with the
XMLReader set not to validate. But now the SAX parser only triggers the
characters() method, with strings that look like
"| ".
I can still trim the strings and only process the ones that really contain
characters, but what I am interested in is: why the heck does Xerces behave
differently when I use exactly the same steps to set it up?
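The trimming workaround could be sketched roughly as follows. This is only
an illustration, not the actual handler: the names WhitespaceFilter,
isXmlWhitespace, and TextCollector are hypothetical. Note that SAX parsers
may deliver one run of text in several characters() calls, so this simple
version filters chunk by chunk.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceFilter {
    /** True if the slice holds only XML whitespace (space, tab, CR, LF). */
    public static boolean isXmlWhitespace(char[] ch, int start, int length) {
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true;
    }

    /** Keeps only the text chunks that contain real character data. */
    public static class TextCollector extends DefaultHandler {
        public final List<String> chunks = new ArrayList<>();

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip chunks that an older parser reports as plain characters
            // even though they are only whitespace between elements.
            if (!isXmlWhitespace(ch, start, length)) {
                chunks.add(new String(ch, start, length).trim());
            }
        }
    }
}
```

With a check like this the handler behaves the same whether the parser
reports inter-element whitespace through characters() or through
ignorableWhitespace().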
The call stack this time looks like this:
de.elmedia.StructureParser$AbmlHandler.characters(char[], int, int) line: 87
org.apache.xerces.parsers.SAXParser.characters(char[], int, int) line: 1574
org.apache.xerces.validators.common.XMLValidator.processWhitespace(char[], int, int) line: 654
org.apache.xerces.readers.UTF8Reader.scanContent(org.apache.xerces.utils.QName) line: 2246
org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(boolean) line: 1145
org.apache.xerces.framework.XMLDocumentScanner.parseSome(boolean) line: 380
org.apache.xerces.parsers.SAXParser(org.apache.xerces.framework.XMLParser).parse(org.xml.sax.InputSource) line: 908
So how can this be? Why are the classes from the .framework package used
for the servlet, and the ones from the .impl package for the application?
My XML source looks like this (nothing fancy):
<?xml version ="1.0"?>
<!DOCTYPE Struktur SYSTEM "Struktur.dtd">
<Struktur>
<Kategorie RootTemplate="Services.html">
<LinkObjekt ID="" pic="">Services</LinkObjekt>
<Kategorie RootTemplate="seach.html">
<LinkObjekt ID="" pic="">search</LinkObjekt>
</Kategorie>
<Kategorie RootTemplate="cart.html"></Kategorie>
...........
.......
</Struktur>
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]