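characters() may be called multiple times [1][2] for a single run of contiguous text, so the "04_HK" you are seeing is most likely only the last chunk delivered, not the whole value. Your ContentHandler needs to accumulate the text passed to each call of characters() until it receives a callback other than characters() (for example endElement()).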
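As a rough illustration of the accumulation, a handler along the following lines should work. This is only a sketch: the class name SymbolHandler, the helper parseSymbol() and the getLastSymbol() accessor are made up for this example and are not part of your code or of Xerces. It also uses StringBuilder and @Override, which need Java 5 or later; on JDK 1.4.2 you would use StringBuffer and drop the annotations.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the full text of each <SYMBOL> element.
class SymbolHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private boolean inSymbol = false;
    private String lastSymbol;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("SYMBOL".equals(qName)) {
            inSymbol = true;
            buffer.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node: append, never overwrite.
        if (inSymbol) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("SYMBOL".equals(qName)) {
            // Only now is the element's text known to be complete.
            lastSymbol = buffer.toString();
            inSymbol = false;
        }
    }

    String getLastSymbol() {
        return lastSymbol;
    }

    // Convenience helper for the example: parse a string, return the last SYMBOL text.
    static String parseSymbol(String xml) {
        SymbolHandler handler = new SymbolHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.getLastSymbol();
    }

    public static void main(String[] args) {
        String xml = "<SYMBOL SOURCE=\"SANDP\" SOURCETYPE=\"VENDOR\">1104_HK</SYMBOL>";
        System.out.println(parseSymbol(xml));
    }
}
```

The point is that the String is built in endElement(), not in characters(), so it does not matter how the parser chooses to split the text across calls.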
[1] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)
[2] http://xerces.apache.org/xerces2-j/faq-sax.html#faq-2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

Chan Wilson <[EMAIL PROTECTED]> wrote on 06/01/2006 05:53:52 AM:

> Hi,
>
> I tried to parse a XML file with standard JAXP SAX. However, for the
> following element:
>
> <SYMBOL SOURCE="SANDP" SOURCETYPE="VENDOR">1104_HK</SYMBOL>
>
> It always return "04_HK" instead of "1104_HK" (by capturing the
> "data" in following code fragment).
>
> public void characters(char[] ch, int start, int width) throws
> SAXException {
>     if (currentTagInPage != null && ch != null) {
>         String data = new String(ch, start, width);
>
>
> The following are the software component used:
> JDK 1.4.2
> Tomcat 4.1.31
> Xerces Parser 2.8.0
>
> Is it a bug in Xerces Parser? Please help.
