[jira] Created: (TIKA-530) InvalidFormatException on a PackagePart in OOXML

2010-10-12 Thread Sjoerd Smeets (JIRA)
InvalidFormatException on a PackagePart in OOXML
Key: TIKA-530
URL: https://issues.apache.org/jira/browse/TIKA-530
Project: Tika
Issue Type: Bug
Affects Versions: 0.8
Reporter:

[jira] Resolved: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement)

2010-10-12 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ken Krugler resolved TIKA-528.
Resolution: Fixed
Fix Version/s: 0.8
Committed: http://svn.apache.org/viewvc/?rev=1021915. Thanks

[jira] Issue Comment Edited: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement)

2010-10-12 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920334#action_12920334 ] Ken Krugler edited comment on TIKA-528 at 10/12/10 4:30 PM: Commi

[jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents.

2010-10-12 Thread Cristian Vat (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cristian Vat updated TIKA-422:
Attachment: RTFParser.patch
Attached updated version of the patch. Changed isUnicode to include characte

[jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents.

2010-10-12 Thread Cristian Vat (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920365#action_12920365 ]
Cristian Vat commented on TIKA-422:
Clarification: The previous patch added some extra sp

[jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents.

2010-10-12 Thread Cristian Vat (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cristian Vat updated TIKA-422:
Attachment: RTFParser.patch
Attached new patch. Added test for checking space after umlaut/encoded chara