HTML parser should produce XHTML SAX events
-------------------------------------------

                 Key: TIKA-128
                 URL: https://issues.apache.org/jira/browse/TIKA-128
             Project: Tika
          Issue Type: Improvement
          Components: parser
            Reporter: Jukka Zitting


The current HTML parser just sanitizes the input HTML and passes it forward 
with no structural changes.

Unfortunately this is incompatible with the other Tika parsers, which all 
produce XHTML output, so IMHO the HTML parser should output XHTML SAX 
events as well.
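
To make the proposal concrete, here is a rough sketch of what "XHTML SAX
events" could look like on the consumer side. It uses only the standard
org.xml.sax API from the JDK and is not Tika's actual implementation. The
class name XhtmlSaxSketch and the helpers emit() and record() are
hypothetical, and the sketch assumes the title and paragraph text have
already been extracted from the input HTML.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only, not part of Tika.
class XhtmlSaxSketch {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Emits a minimal XHTML document (html/head/title/body/p) as SAX events.
    static void emit(ContentHandler h, String title, String paragraph)
            throws SAXException {
        h.startDocument();
        h.startPrefixMapping("", XHTML);
        start(h, "html");
        start(h, "head");
        start(h, "title");
        text(h, title);
        end(h, "title");
        end(h, "head");
        start(h, "body");
        start(h, "p");
        text(h, paragraph);
        end(h, "p");
        end(h, "body");
        end(h, "html");
        h.endPrefixMapping("");
        h.endDocument();
    }

    private static void start(ContentHandler h, String name) throws SAXException {
        // Every element is reported in the XHTML namespace, without attributes.
        h.startElement(XHTML, name, name, new AttributesImpl());
    }

    private static void end(ContentHandler h, String name) throws SAXException {
        h.endElement(XHTML, name, name);
    }

    private static void text(ContentHandler h, String s) throws SAXException {
        char[] chars = s.toCharArray();
        h.characters(chars, 0, chars.length);
    }

    // Records the emitted events as strings so they can be inspected.
    static List<String> record(String title, String paragraph) {
        final List<String> events = new ArrayList<>();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                events.add("start:" + local);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("end:" + local);
            }

            @Override
            public void characters(char[] ch, int offset, int length) {
                events.add("text:" + new String(ch, offset, length));
            }
        };
        try {
            emit(recorder, title, paragraph);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(record("Example", "Hello"));
    }
}
```

The point of the sketch is that a client handling these events sees the
same html/head/body structure no matter which parser produced them, which
is what the other Tika parsers already provide.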

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
