DWG parser infinite loop on possibly corrupt file -------------------------------------------------
Key: TIKA-788 URL: https://issues.apache.org/jira/browse/TIKA-788 Project: Tika Issue Type: Bug Components: parser Affects Versions: 1.0 Reporter: Stas Shaposhnikov When parsing some dwg items, it is possible that the parser may cause itself to go into an infinite loop. Attached is the file causing the problem. Here is a possible patch that will at least proceed until an error is thrown. === modified file 'tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java' --- tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java 2011-11-24 11:30:33 +0000 +++ tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java 2011-11-25 05:27:41 +0000 @@ -274,8 +274,10 @@ return false; } while (toSkip > 0) { - byte[] skip = new byte[Math.min((int) toSkip, 0x4000)]; - IOUtils.readFully(stream, skip); + byte[] skip = new byte[(int) Math.min(toSkip, 0x4000)]; + if (IOUtils.readFully(stream, skip) == -1) { + return false; //invalid skip + } toSkip -= skip.length; } return true; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira