[ https://issues.apache.org/jira/browse/TIKA-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021828#comment-13021828 ]
Nick Burch commented on TIKA-642:
---------------------------------

The exception is coming from javax.swing, so there may not be much we can do. However, a quick Google search suggests that the most common reason for the Swing classes to throw this exception is a corrupt file that closes more groups than it opens.

> Few of RTF files not extracting properly
> ----------------------------------------
>
>                 Key: TIKA-642
>                 URL: https://issues.apache.org/jira/browse/TIKA-642
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9, 1.0
>         Environment: All
>            Reporter: Manish
>         Attachments: FIRM GAS GTC B RED.DOC
>
>
> A few of the RTF files don't get extracted properly.
> This is the stack trace:
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.rtf.RTFParser@616d071a
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:203)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> Caused by: java.io.IOException: Too many close-groups in RTF text
>     at javax.swing.text.rtf.RTFParser.write(RTFParser.java:156)
>     at javax.swing.text.rtf.RTFParser.writeSpecial(RTFParser.java:101)
>     at javax.swing.text.rtf.AbstractFilter.write(AbstractFilter.java:158)
>     at javax.swing.text.rtf.AbstractFilter.readFromStream(AbstractFilter.java:88)
>     at javax.swing.text.rtf.RTFEditorKit.read(RTFEditorKit.java:65)
>     at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:112)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
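
One way to test the corrupt-file theory from the comment is to count group delimiters in the attachment before handing it to Tika. The sketch below is only an illustration: RtfGroupCheck and firstUnbalancedClose are hypothetical names, not part of Tika or Swing. It assumes that a backslash escapes the character that follows it, and it does not account for raw \bin data, so it may disagree with the Swing parser on unusual files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical diagnostic helper; not part of Tika or javax.swing. */
final class RtfGroupCheck {

    /**
     * Returns the offset of the first '}' that has no matching '{',
     * or -1 if every close-group is matched by an earlier open-group.
     */
    static int firstUnbalancedClose(String rtf) {
        int depth = 0;
        for (int i = 0; i < rtf.length(); i++) {
            char c = rtf.charAt(i);
            if (c == '\\') {
                // Skip the next character: it is either an escaped
                // '{', '}' or '\', or the start of a control word.
                i++;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                if (depth == 0) {
                    return i; // close-group with nothing left to close
                }
                depth--;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 maps each byte to one char, so string offsets
        // are also byte offsets into the file.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String rtf = new String(bytes, StandardCharsets.ISO_8859_1);
        int pos = firstUnbalancedClose(rtf);
        System.out.println(pos < 0
                ? "No extra close-group found"
                : "Extra close-group at byte offset " + pos);
    }
}
```

Running it with the path of the attached file as the only argument would show whether the file contains a close-group with no matching open-group, and at which offset.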