Package: wnpp
Severity: wishlist

* Package name    : html5tidy
  Version         : git master
  Upstream Author : Michael Murtaugh & The active archives contributors
* URL             : https://github.com/aleray/html5tidy.git
* License         : GPLv3+
  Programming Lang: Python 2
  Description     : “tidy” in-the-wild HTML 5 into well-formed XML or HTML

Since tidy fails hard on many HTML 5 documents (e.g. producing no
output at all), this package can be used to transform in-the-wild
HTML 5 documents into input that xmlstarlet can actually act on,
e.g. for data extraction with XPath and XSLT via “xmlstarlet sel”.
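
To illustrate why well-formed output matters for the XPath step, here
is a minimal sketch using only the Python standard library. It does
not call html5tidy itself; RAW_HTML and TIDIED_XML are hypothetical
samples, and the tidied form is an assumption about what such a tool
might produce, not its actual output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: HTML 5 as found in the wild, with unclosed
# <p> and <br> elements. This is valid HTML 5 but not well-formed XML.
RAW_HTML = "<!DOCTYPE html><html><body><p>First<br><p>Second</body></html>"

# Hypothetical sample: what a tidied, well-formed version could look
# like (assumption; real tool output may differ, e.g. add namespaces).
TIDIED_XML = (
    "<html><head/><body>"
    "<p>First<br/></p><p>Second</p>"
    "</body></html>"
)


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def extract_paragraphs(xml_text):
    """Select the leading text of every <p> element via an XPath-style query."""
    root = ET.fromstring(xml_text)
    return [p.text for p in root.findall(".//p")]


if __name__ == "__main__":
    print(is_well_formed(RAW_HTML))
    print(is_well_formed(TIDIED_XML))
    print(extract_paragraphs(TIDIED_XML))
```

The query in extract_paragraphs is roughly what
“xmlstarlet sel -t -v '//p'” would select on the same tidied input;
on the raw input an XML parser gives up before any query can run.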
