HtmlParser doesn't extract links from img, map, object, frame, iframe, area,
link
---------------------------------------------------------------------------------
Key: TIKA-463
URL: https://issues.apache.org/jira/browse/TIKA-463
Project: Tika
Issue Type: Bug
Reporter: Ken Krugler
Assignee: Ken Krugler
All of the listed HTML elements can have URLs as attributes, and thus we'd want
to extract those links, if possible.
For elements that aren't valid in XHTML 1.0, it may be unclear what the right
way to handle them is.
But if XHTML 1.0 means the union of the "transitional" and "frameset" variants,
then all of the above are valid, and thus should be emitted by the parser.
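
A rough sketch of the element-to-attribute lookup this would involve. It is not
Tika's actual code: the LinkAttributes class and urlsFor method are hypothetical
names, attribute names are assumed to be lower-cased already, and only the
primary URL attribute of each element is listed. "map" is left out of the table
on the assumption that its URLs are carried by its child "area" elements.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Tika.
final class LinkAttributes {

    // Element name -> attributes that may hold a URL (primary attribute only).
    private static final Map<String, List<String>> URL_ATTRIBUTES = new LinkedHashMap<>();

    static {
        URL_ATTRIBUTES.put("img", List.of("src"));
        URL_ATTRIBUTES.put("object", List.of("data"));
        URL_ATTRIBUTES.put("frame", List.of("src"));
        URL_ATTRIBUTES.put("iframe", List.of("src"));
        URL_ATTRIBUTES.put("area", List.of("href"));
        URL_ATTRIBUTES.put("link", List.of("href"));
        // "map" is intentionally absent (assumption: its links live on <area>).
    }

    // Returns the non-empty URL attribute values found on the given element.
    // Assumes the keys of `attributes` are lower-case attribute names.
    static List<String> urlsFor(String element, Map<String, String> attributes) {
        List<String> urls = new ArrayList<>();
        List<String> names = URL_ATTRIBUTES.get(element.toLowerCase(Locale.ROOT));
        if (names == null) {
            return urls; // element is not known to carry links
        }
        for (String name : names) {
            String value = attributes.get(name);
            if (value != null && !value.trim().isEmpty()) {
                urls.add(value.trim());
            }
        }
        return urls;
    }
}
```

For example, LinkAttributes.urlsFor("img", Map.of("src", "logo.png")) would
yield a one-element list containing "logo.png". A real fix would still need to
resolve relative URLs and decide how each link is emitted as XHTML.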