DWG parser infinite loop on possibly corrupt file
-------------------------------------------------

                 Key: TIKA-788
                 URL: https://issues.apache.org/jira/browse/TIKA-788
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.0
            Reporter: Stas Shaposhnikov


When parsing some DWG files, the parser can get stuck in an infinite loop.

Attached is the file causing the problem.

Here is a possible patch that at least lets parsing proceed until an error is 
thrown, instead of looping.


=== modified file 'tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java'
--- tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java	2011-11-24 11:30:33 +0000
+++ tika-parsers/src/main/java/org/apache/tika/parser/dwg/DWGParser.java	2011-11-25 05:27:41 +0000
@@ -274,8 +274,10 @@
             return false;
         }
         while (toSkip > 0) {
-            byte[] skip = new byte[Math.min((int) toSkip, 0x4000)];
-            IOUtils.readFully(stream, skip);
+            byte[] skip = new byte[(int) Math.min(toSkip, 0x4000)];
+            if (IOUtils.readFully(stream, skip) == -1) {
+               return false; //invalid skip
+            }
             toSkip -= skip.length;
         }
         return true;
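
For illustration, here is a minimal sketch of why the order of the cast and 
the Math.min call in the patch may matter. It assumes toSkip is a long (the 
(int) cast in the diff suggests this, but the declaration is not shown). The 
class and method names are hypothetical and are not part of DWGParser. If the 
low 32 bits of toSkip happen to be zero, casting first gives a zero-length 
buffer, so toSkip never decreases and the loop never ends.

```java
// Hypothetical demo class, not part of Tika.
class SkipSizeDemo {
    // Buffer size as computed before the patch: cast to int first, then clamp.
    static int chunkBefore(long toSkip) {
        return Math.min((int) toSkip, 0x4000);
    }

    // Buffer size as computed by the patch: clamp as a long first, then cast.
    static int chunkAfter(long toSkip) {
        return (int) Math.min(toSkip, 0x4000);
    }

    public static void main(String[] args) {
        long lowBitsZero = 1L << 32; // truncates to 0 when cast to int
        long signBitSet = 1L << 31;  // truncates to Integer.MIN_VALUE

        // Zero-length buffer: "toSkip -= skip.length" makes no progress.
        System.out.println(chunkBefore(lowBitsZero));
        // Negative size: "new byte[...]" would throw NegativeArraySizeException.
        System.out.println(chunkBefore(signBitSet));

        // Clamping before the cast always yields a size in 1..0x4000 here.
        System.out.println(chunkAfter(lowBitsZero));
        System.out.println(chunkAfter(signBitSet));
    }
}
```

Whether the attached file actually produces such a value is not confirmed 
here; this only shows that the pre-patch expression can yield a zero or 
negative buffer size for large values of toSkip.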



        
