--- Andrew McConnell <[EMAIL PROTECTED]> wrote:
> I'm sure that this has been asked a million times on the list, but I
> can't seem to get to the archive:
>
> I am using JDK 1.3 on Linux/Xerces 1.4.3
>
> I am reading in a document that contains & characters and the like
> inside an element.
> eg: <tag>Dog & Cat</tag>
>
> The SAX parser is splitting up the contents, so that I get "Dog" in one
> call to characters(), and "&" in the next. I want characters() to give
> "Dog & Cat" to me in the same call.
>
> Is there a feature or a property of some sort that I need to set in the
> XMLReader? It seems like there ought to be a simple way to do this, but
> I'm stumped unless I rewrite this piece of my application to be a lot
> smarter than I think it needs to be!
>
> Thanks.
>
> --
> Andrew McConnell
> Socketware, Inc.
> [EMAIL PROTECTED]
Andrew,

This behavior is compliant with the SAX specification, and there is nothing you can (or should be able to) do about it. In fact, nothing prevents the parser from giving you 'd', 'o', and 'g' in three separate callbacks. You have to accumulate the characters delivered by successive characters() callbacks yourself. Even aside from the entity reference behavior, you should be doing this, or you'll run into problems if you ever plug in a different parser implementation that breaks the character data up some other way (e.g., the d-o-g example).

Hope this helps.

-Jason
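To make the buffering advice concrete, here is a minimal sketch of a handler that collects text across characters() calls and only hands it over in endElement(). The class name TextCollector and the parse() helper are made up for illustration, and the sketch assumes the ampersand in the input is escaped as &amp;amp; and that you only care about leaf elements (it resets the buffer on every startElement(), so mixed content would need a stack of buffers instead). It uses the JAXP SAXParserFactory and StringBuilder from a current JDK; on JDK 1.3 you would use StringBuffer and create the Xerces XMLReader directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates character data per element instead of
// relying on a single characters() callback.
class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Start with an empty buffer for each element (leaf elements only).
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called any number of times for one run of text; just append.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The element's text is complete only here.
        texts.add(buffer.toString());
        buffer.setLength(0);
    }

    List<String> getTexts() {
        return texts;
    }

    // Illustrative helper: parse an XML string and return the collected text.
    static List<String> parse(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getTexts();
    }
}
```

Calling TextCollector.parse("<tag>Dog &amp;amp; Cat</tag>") should give you a list containing the single string "Dog & Cat", no matter how the parser chooses to split the callbacks.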
