On 13/10/2009, Stefan Behnel <[email protected]> wrote:
>
> Lydia Patrovic wrote:
>> Note the "main&amp;20090924_2" attribute value, which can be
>> interpreted as an unterminated entity.
>
> :) Nice little Freudian copy&paste quoting error. Here's the line from the
> real 'HTML' file:
>
> <script type="text/javascript" src="merge.php?f=main&20090924_2"></script>
>
> Note the unescaped '&' character in the URL.

I'd have thought the embedded null at byte 532 would be the cause. Try
bytes.replace("\x00", "") before treating it as a C string. With that
change the document parses pretty much as expected for me.
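
A minimal sketch of that suggestion, in Python 3 syntax. The sample
data and the helper names (first_null_offset, strip_nulls) are made up
for illustration and are not taken from the real file; ctypes is used
only to show what a NUL-terminated C string would end up holding.

```python
import ctypes


def first_null_offset(data: bytes) -> int:
    """Return the offset of the first NUL byte, or -1 if there is none."""
    return data.find(b"\x00")


def strip_nulls(data: bytes) -> bytes:
    """Remove every NUL byte so C code sees the whole buffer."""
    return data.replace(b"\x00", b"")


# Hypothetical stand-in for the real file: a NUL sits mid-document.
sample = b"<html><body>before\x00after</body></html>"

# Read as a NUL-terminated C string, the buffer stops at the first NUL.
as_c_string = ctypes.c_char_p(sample).value

# After stripping, the whole document survives the trip into C.
cleaned = strip_nulls(sample)

print("first NUL at offset:", first_null_offset(sample))
print("as C string:", as_c_string)
print("cleaned:", cleaned)
```

Whether dropping the byte or replacing it with a space is the better
choice depends on where the null sits; dropping it is just the
simplest thing to try first.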

Martin
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
