Hello all,
I'm having some trouble with a numeric character
reference. I have some well-formed, UTF-8 encoded HTML
that I am pulling from a database table and would like
to parse and manipulate with dom4j. Some of the HTML
contains numeric character references like ”
to represent a right (closing) quotation mark. After
creating a Document object with SAXReader, the
references are converted to a single character. For
example, ” is converted to a character that shows up
as 1C when viewed in a hex editor.
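To make the behaviour concrete, here is a minimal sketch that reproduces it. It rests on some assumptions: it uses the JDK's built-in JAXP parser rather than dom4j's SAXReader (SAXReader delegates to a SAX parser, so I would expect the same result), and it assumes the reference is written in decimal form as `&#8221;`. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of dom4j.
class CharRefDemo {

    // Parses a small XML string and returns the text content of its root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Assumed input: the quotation mark is written as a decimal reference.
        String text = rootText("<p>quote&#8221;</p>");

        // The seven-character reference has become one character in the tree.
        System.out.println(text.length());                       // 6
        System.out.println(Integer.toHexString(text.charAt(5))); // 201d
    }
}
```

As far as I can tell, the expansion happens inside the parser itself, so by the time the Document is built the original reference text is already gone.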
Is there a way to disable the processing of numeric
character references? I realize this may be a parser
issue, but I was curious whether anyone had run into a
similar problem.
Thanks in advance for any help.
Kevin
_______________________________________________
dom4j-user mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dom4j-user