[jira] [Commented] (TIKA-657) Email parser gets into trouble on malformed html in enron corpus

2011-10-13 Thread Jukka Zitting (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126938#comment-13126938 ] Jukka Zitting commented on TIKA-657: In revision 1183109 I increased the default line an

[jira] [Commented] (TIKA-657) Email parser gets into trouble on malformed html in enron corpus

2011-05-27 Thread Mark Butler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040174#comment-13040174 ] Mark Butler commented on TIKA-657: I have submitted code to support turning off strict par

[jira] [Commented] (TIKA-657) Email parser gets into trouble on malformed html in enron corpus

2011-05-27 Thread Mark Butler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040149#comment-13040149 ] Mark Butler commented on TIKA-657: I took the Enron dataset and processed it using Tika an

[jira] [Commented] (TIKA-657) Email parser gets into trouble on malformed html in enron corpus

2011-05-08 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030467#comment-13030467 ] Julien Nioche commented on TIKA-657: Good idea. We need more tutorials and example for B