On Thu, Feb 04, 2010 at 09:24:27AM -0800, Aaron Patterson wrote:
> On Thu, Feb 4, 2010 at 12:53 AM, Daniel Veillard <veill...@redhat.com> wrote:
> > On Wed, Feb 03, 2010 at 08:34:09PM -0800, Aaron Patterson wrote:
> >> I can't seem to pass an encoding to xmlParseInNodeContext. This is
> >> problematic when dealing with UTF-8 HTML documents. I can tell
> >> libxml2 what encoding to use when originally parsing the document, but
> >> it looks like that is completely ignored when using
> >> xmlParseInNodeContext. Reference nodes in HTML documents completely
> >> ignore the original document encoding and use ISO-8859-1.
> >>
> >> Here is a sample program to illustrate the problem:
> >>
> >> http://pastie.org/808860
> >>
> >> I tried putting together a patch, and it didn't seem to work:
> >>
> >> http://pastie.org/808862
> >>
> >> Ideally, I would like a function similar to xmlParseInNodeContext, but
> >> one that takes an encoding as a parameter. Thanks!
> >
> > Rather than add Yet Another Entry Point, I think the most logical
> > is to parse using the encoding from the document, since it's an "in
> > context" parsing, i.e. parsing as if the fragment was coming from that
> > document. The encoding switch is a bit harder than what you hoped for,
> > but it's not that hard, the patch enclosed seems to do it for me, please
> > have a try.
>
> Perfect. It works great for me! Thank you very much!

  Okay, pushed to head

> Any suggestions for workarounds to older versions of libxml2? I'm
> tempted to copy this function to my C code, but I'd rather not if
> possible.

  If the patch applies that should be fine,

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine  http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml