[ https://issues.apache.org/jira/browse/TIKA-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Palsulich updated TIKA-381:
---------------------------------
    Affects Version/s:     (was: 0.6)
                       1.8

> HtmlParser should strip linefeeds out of links
> ----------------------------------------------
>
>                 Key: TIKA-381
>                 URL: https://issues.apache.org/jira/browse/TIKA-381
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.8
>            Reporter: Ken Krugler
>            Assignee: Ken Krugler
>
> A number of HTML pages contain links where the URL has a linefeed in the middle of it.
> Browsers such as Firefox will automatically remove the character but Tika passes it back, which results in a broken URL.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
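
For illustration only, the behavior requested in the issue could be sketched as below. This is not Tika's actual fix or API: the class name `LinkCleaner` and the method `stripLineBreaks` are hypothetical, and the choice to also drop carriage returns and tabs and to trim surrounding whitespace is an assumption, modeled on how browsers discard tab and newline characters when parsing a URL.

```java
// Hypothetical helper, not part of Tika: removes line breaks embedded in a link target.
final class LinkCleaner {

    private LinkCleaner() {
        // static utility, no instances
    }

    /**
     * Returns the URL with every '\n', '\r' and '\t' removed, then trimmed of
     * leading and trailing whitespace. Returns null for null input.
     * Assumption: tabs and carriage returns are treated the same as linefeeds.
     */
    static String stripLineBreaks(String url) {
        if (url == null) {
            return null;
        }
        StringBuilder cleaned = new StringBuilder(url.length());
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c != '\n' && c != '\r' && c != '\t') {
                cleaned.append(c);
            }
        }
        return cleaned.toString().trim();
    }
}
```

A parser could apply such a helper to each `href` or `src` attribute value before reporting the link, for example `LinkCleaner.stripLineBreaks(rawHref)`, so that a URL broken across two lines in the markup is returned as a single unbroken string.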