NW Brad created TIKA-2562:
-----------------------------

             Summary: tika server parse HTML removes DIVs around hyperlink & adds shape
                 Key: TIKA-2562
                 URL: https://issues.apache.org/jira/browse/TIKA-2562
             Project: Tika
          Issue Type: Bug
          Components: gui, parser, server
    Affects Versions: 1.17
            Reporter: NW Brad
         Attachments: tika_adds_shape_to_hyperlink.html

Parsing an HTML file via the server:

curl -X PUT --upload-file tika_adds_shape_to_hyperlink.html http://localhost:9998/tika --header "Accept: text/html"

sent:

<div>
<a href="http://www.google.com">http://www.google.com</a>
</div>

received back:

<a shape="rect" href="http://www.google.com">http://www.google.com</a>

The divs are gone and a shape attribute has been added.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
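
To make the two differences reported above easy to check mechanically, here is a minimal sketch using only the Python standard library. It compares the "sent" and "received back" fragments quoted in this report and lists tags that disappeared and attributes that appeared. The helper names (TagCollector, collect, diff_markup) are hypothetical and are not part of Tika. The input is the quoted fragments, not a live server response. A full response would likely contain additional wrapper markup, which this simple comparison would also flag.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag and its attributes, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []  # list of (tag_name, {attribute: value})

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def collect(html):
    """Parse an HTML fragment and return its start tags."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


def diff_markup(sent, received):
    """Return (dropped_tags, added_attrs) between two HTML fragments.

    dropped_tags: tag names present in `sent` but absent from `received`.
    added_attrs:  (tag, attribute) pairs present in `received` that no
                  element of the same name carried in `sent`.
    """
    sent_tags = collect(sent)
    recv_tags = collect(received)

    recv_names = {name for name, _ in recv_tags}
    dropped = sorted({name for name, _ in sent_tags} - recv_names)

    # Attributes seen per tag name in the original input.
    sent_attrs = {}
    for name, attrs in sent_tags:
        sent_attrs.setdefault(name, set()).update(attrs)

    added = sorted(
        {
            (name, attr)
            for name, attrs in recv_tags
            for attr in attrs
            if attr not in sent_attrs.get(name, set())
        }
    )
    return dropped, added


# Fragments copied from the report above.
SENT = '<div> <a href="http://www.google.com">http://www.google.com</a> </div>'
RECEIVED = '<a shape="rect" href="http://www.google.com">http://www.google.com</a>'

dropped, added = diff_markup(SENT, RECEIVED)
print("dropped tags:", dropped)
print("added attributes:", added)
```

Run against the fragments in this report, the sketch flags the missing div element and the extra shape attribute on the anchor. The same function could be pointed at the body returned by the curl command above once the response has been saved to a string.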