[ https://issues.apache.org/jira/browse/TIKA-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378434#comment-15378434 ]

Ken Krugler commented on TIKA-2033:
-----------------------------------

Yes, of course...I was thinking of whether we'd want to extract it as text, and if so, how it should appear in the output.

> Value attributes of input elements not extracted from HTML
> -----------------------------------------------------------
>
>                 Key: TIKA-2033
>                 URL: https://issues.apache.org/jira/browse/TIKA-2033
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.10
>         Environment: Windows 7, java8 x64
>            Reporter: Luis Filipe Nassif
>            Priority: Minor
>
> The text of value attributes of input elements currently is not extracted
> from HTML files. Note it is rendered by browsers. I tried using
> IdentityHtmlMapper and played with HtmlSchema with no luck. Simple test HTML
> below:
> <HTML><body><input value='text'></input></body></HTML>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)