[Bug 1706274] Re: html2xhtml produces invalid XML for MS Office HTML output

2017-09-14 Thread Kjetil Kjernsmo
I'm not 100% sure, since this module is a parser and not a serializer, but it appears HTML::HTML5::Parser is just building a DOM, and the serialization is then done by XML::LibXML. Therefore, it seems likely the bug is indeed in HTML::HTML5::Parser. The upstream bug tracker is at https://rt.cpan.o

[Bug 1706274] Re: html2xhtml produces invalid XML for MS Office HTML output

2017-09-14 Thread Kjetil Kjernsmo
I was able to reproduce the bug, and I have committed the example as test data to my own fork of the module: https://github.com/kjetilk/p5-html-html5-parser/commit/c4be3e6ee63d0850079c115ef4274e4c2c3befa9 I'm not a maintainer, so this doesn't bring us much closer to a solution. Just did it