On Thu, Feb 04, 2010 at 09:24:27AM -0800, Aaron Patterson wrote:
> On Thu, Feb 4, 2010 at 12:53 AM, Daniel Veillard <veill...@redhat.com> wrote:
> > On Wed, Feb 03, 2010 at 08:34:09PM -0800, Aaron Patterson wrote:
> >> I can't seem to pass an encoding to xmlParseInNodeContext.  This is
> >> problematic when dealing with UTF-8 HTML documents.  I can tell
> >> libxml2 what encoding to use when originally parsing the document, but
> >> it looks like that is completely ignored when using
> >> xmlParseInNodeContext.  Fragments parsed against a reference node in
> >> an HTML document ignore the original document encoding and are read
> >> as ISO-8859-1 instead.
> >>
> >> Here is a sample program to illustrate the problem:
> >>
> >> http://pastie.org/808860
> >>
> >> I tried putting together a patch, and it didn't seem to work:
> >>
> >> http://pastie.org/808862
> >>
> >> Ideally, I would like a function similar to xmlParseInNodeContext, but
> >> one that takes an encoding as a parameter.  Thanks!
> >
> >  Rather than add Yet Another Entry Point, I think the most logical
> > approach is to parse using the encoding from the document, since it's
> > an "in context" parsing, i.e. parsing as if the fragment were coming
> > from that document. The encoding switch is a bit harder than what you
> > hoped for, but it's not that hard. The enclosed patch seems to do it
> > for me; please give it a try.
> 
> Perfect.  It works great for me!  Thank you very much!

  Okay, pushed to head

> Any suggestions for workarounds for older versions of libxml2?  I'm
> tempted to copy this function into my C code, but I'd rather not if
> possible.

  If the patch applies, that should be fine.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml
