Automatically let all valid XHTML 1.0 attributes through from HTML documents
----------------------------------------------------------------------------

                 Key: TIKA-430
                 URL: https://issues.apache.org/jira/browse/TIKA-430
             Project: Tika
          Issue Type: Improvement
          Components: parser
            Reporter: Ken Krugler
            Assignee: Ken Krugler


Many consumers of parse output wouldn't want to process the raw (unnormalized)
elements they'd get with the IdentityHtmlMapper, but they would still want any
standard attributes on the normalized elements. For example, with <a> elements
they would want to get any rel attributes.

I believe this would require changing the DefaultHtmlMapper to "know" about 
valid attributes for different elements.
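
As a rough illustration of what "knowing" about valid attributes could look
like, here is a minimal, self-contained sketch. It is not Tika's actual
DefaultHtmlMapper code: the class name SafeAttributeSketch, the
SAFE_ATTRIBUTES table and the mapSafeAttribute method are all hypothetical,
and the attribute lists are only a small subset of XHTML 1.0 chosen for the
example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a per-element table of attributes that may pass through.
public class SafeAttributeSketch {

    // Lower-case element name -> lower-case attribute names allowed on it.
    // Illustrative subset only, not the full XHTML 1.0 attribute lists.
    private static final Map<String, Set<String>> SAFE_ATTRIBUTES = new HashMap<>();

    static {
        SAFE_ATTRIBUTES.put("a", Set.of("href", "hreflang", "name", "rel", "rev", "title", "type"));
        SAFE_ATTRIBUTES.put("img", Set.of("alt", "height", "src", "title", "width"));
    }

    // Returns the normalized (lower-case) attribute name if it is allowed on
    // the given element, or null if the attribute should be dropped.
    public static String mapSafeAttribute(String elementName, String attributeName) {
        String element = elementName.toLowerCase(Locale.ENGLISH);
        String attribute = attributeName.toLowerCase(Locale.ENGLISH);
        Set<String> allowed = SAFE_ATTRIBUTES.get(element);
        return (allowed != null && allowed.contains(attribute)) ? attribute : null;
    }

    public static void main(String[] args) {
        System.out.println(mapSafeAttribute("A", "REL"));      // prints "rel"
        System.out.println(mapSafeAttribute("a", "onclick"));  // prints "null"
    }
}
```

With a table like this, the mapper keeps normalizing element names as before
while letting through only the attributes listed for each element; anything
not listed, or any attribute on an unknown element, is dropped.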


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
