Hi Michael,

On Sun, Mar 30, 2008 at 5:35 PM, Michael Glavassevich <[EMAIL PROTECTED]> wrote:
> Hi Daniel,
>
> "Daniel Yokomizo" <[EMAIL PROTECTED]> wrote on 03/29/2008 04:45:24 PM:
>
> > Hi,
> >
> > I'm parsing (disabling validation) a document that declared a DTD
> > but I would like to get the raw attribute values instead of the
> > normalized values. In particular I need to keep entity references as
> > they were written. I came up with this FAQ
> > (http://xerces.apache.org/xerces-j/faq-write.html#faq-7) that seems to
> > declare that it is impossible (i.e. attribute normalization happens if
> > there's a DTD present) and I found the XMLScanner class that, via the
> > method scanAttributeValue, does the attribute normalization. I noticed
> > that we have a getNonNormalizedValue() method but the SAX parser layer
> > uses AttributesProxy which hides the getNonNormalizedValue() method.
>
> That method is part of XNI [1]. If you really need the non-normalized text
> you'd need to change your application so that it uses XNI directly (rather
> than SAX).

Thanks for your help (again). I was hoping to use the SAX interface and
not depend explicitly on Xerces, because I'm developing a library that
will (hopefully) be independent of the SAX implementation. There's a hack
I can do to "trick" Xerces, which will work with any other parser too, and
I'll probably do it (essentially I'll decorate the reader I'm giving to the
parser, transforming every & into &amp;, but after it's resolved by the
parser it'll become & again, so &amp; becomes &amp;amp;, which the parser
transforms into &amp;).

> > Is there any way to configure Xerces to not normalize attribute
> > values even when the DTD is declared?
>
> Whether your document has a DTD or not is irrelevant. The FAQ (on the
> Xerces 1.x site) you read is wrong. Normalization [2] is required for every
> attribute value. You cannot disable this behaviour.
>
> > Best regards,
> > Daniel Yokomizo
>
> Thanks.
>
> [1] http://xerces.apache.org/xerces2-j/javadocs/xni/index.html
> [2] http://www.w3.org/TR/2006/REC-xml-20060816/#AVNormalize
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: [EMAIL PROTECTED]
> E-mail: [EMAIL PROTECTED]

Best regards,
Daniel Yokomizo.
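
For reference, here is a minimal sketch of the reader-decorating hack described above. The class and method names (AmpersandEscapingReader, RawAttributeDemo, rawAttribute) are hypothetical, and the demo assumes whatever SAX parser the JDK's SAXParserFactory returns. It uses only standard java.io and SAX interfaces, so it does not depend on Xerces.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical decorator: every '&' read from the wrapped reader is emitted as "&amp;". */
class AmpersandEscapingReader extends FilterReader {
    private static final String ESCAPED = "&amp;";
    // Index into ESCAPED of the next character still to be emitted; 0 means nothing is pending.
    private int pending = 0;

    AmpersandEscapingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pending > 0) {
            char c = ESCAPED.charAt(pending);
            pending = (pending + 1) % ESCAPED.length();
            return c;
        }
        int c = in.read();
        if (c == '&') {
            pending = 1; // the '&' itself is returned now; "amp;" follows on later reads
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would ignore the pending escape state
    }
}

class RawAttributeDemo {
    /** Parses xml through the escaping reader; returns the named attribute from the first element that carries it. */
    static String rawAttribute(String xml, String name) {
        final String[] result = new String[1];
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new AmpersandEscapingReader(new StringReader(xml))),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local, String qName, Attributes atts) {
                        if (result[0] == null) {
                            result[0] = atts.getValue(name);
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE a [<!ENTITY foo \"bar\">]><a href=\"x&foo;y\"/>";
        System.out.println(rawAttribute(xml, "href")); // prints x&foo;y
    }
}
```

Two caveats with this sketch: the parser still applies the whitespace part of attribute-value normalization, so only the references survive as written, and an & inside a CDATA section would show up as the literal text &amp; because CDATA content is not resolved.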