Re: [xml] UTF-8 decoding bug in HTML parser

2008-10-03 Thread Daniel Veillard
On Wed, Oct 01, 2008 at 11:09:27AM +1000, Michael Day wrote:
> Hi Daniel,
> Reusing the XML code for this seems to work fine for me and the regression test, but you probably have a more extensive HTML test suite than me ;-) so raise the problem if there is a regression! Will

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-30 Thread Michael Day
Hi Daniel, Reusing the XML code for this seems to work fine for me and the regression test, but you probably have a more extensive HTML test suite than me ;-) so raise the problem if there is a regression! Will commit to SVN with the test case, Thanks, I'll check it out. I think this greatly

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-26 Thread Daniel Veillard
On Fri, Sep 26, 2008 at 08:29:44PM +1000, Michael Day wrote:
> Hi Daniel,
>
>> Reusing the XML code for this seems to work fine for me and the
>> regression test, but you probably have a more extensive HTML test
>> suite than me ;-) so raise the problem if there is a regression!
>
> Actually, I

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-26 Thread Daniel Veillard
On Fri, Sep 26, 2008 at 08:24:33PM +1000, Michael Day wrote:
> Hi Daniel,
>
>> Reusing the XML code for this seems to work fine for me and the
>> regression test, but you probably have a more extensive HTML test
>> suite than me ;-) so raise the problem if there is a regression!
>> Will commit t

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-26 Thread Michael Day
Hi Daniel, Reusing the XML code for this seems to work fine for me and the regression test, but you probably have a more extensive HTML test suite than me ;-) so raise the problem if there is a regression! Actually, I just remembered one more issue: null bytes in HTML documents terminate t

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-26 Thread Michael Day
Hi Daniel, Reusing the XML code for this seems to work fine for me and the regression test, but you probably have a more extensive HTML test suite than me ;-) so raise the problem if there is a regression! Will commit to SVN with the test case, Thanks, I'll check it out. I think this greatl

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-26 Thread Daniel Veillard
On Fri, Sep 26, 2008 at 02:44:19PM +1000, Michael Day wrote:
> Hi Daniel,
>
>> See patch attached, I'm committing it to SVN as this fixes the specific
>> test case, all the errors seen when parsing subsequently look 'normal'
>> :-) so I added it to the test suite
>
> Excellent!
>
> Would there be

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-25 Thread Michael Day
Hi Daniel, See patch attached, I'm committing it to SVN as this fixes the specific test case, all the errors seen when parsing subsequently look 'normal' :-) so I added it to the test suite Excellent! Would there be any chance that you could look at one more related issue affecting the HTM

Re: [xml] UTF-8 decoding bug in HTML parser

2008-09-25 Thread Daniel Veillard
On Thu, Sep 11, 2008 at 06:12:30PM +1000, Michael Day wrote:
> Hi,
>
> The attached file illustrates a UTF-8 decoding bug in the HTML parser,
> which can be recreated with:
>
> $ xmllint --html utf8bug.html
>
> The last one or two characters in the document are corrupted, and
> xmllint repo