[ https://issues.apache.org/jira/browse/PDFBOX-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hacho updated PDFBOX-537:
-------------------------

    Description: 
The endless loop seems to have been introduced by the changes of 01-Sep-2009 
(svn revision 810122), which added a loop that skips input until the start of 
a dictionary key ('/') or the end of the dictionary ('>') is found.

Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
===================================================================
--- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java    (revision 793364)
+++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java    (revision 810122)
@@ -183,7 +183,23 @@
             if( c == '>')
             {
                 done = true;
-            }
+            } 
+            else 
+                if(c != '/') 
+                {
+                    //an invalid dictionary, we are expecting
+                    //the key, read until we can recover
+                    logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
+                    int read = pdfSource.read();
+                    while(read != -1 && read != '/' && read != '>')
+                    {
+                        read = pdfSource.read();
+                    }
+                    if(read != -1) 
+                    {
+                        pdfSource.unread(read);
+                    }
+                }
             else
             {
                 COSName key = parseCOSName();
@@ -206,9 +222,12 @@
 
                 if( value == null )
                 {
-                    throw new IOException("Bad Dictionary Declaration " + pdfSource );
+                    logger().warning("Bad Dictionary Declaration " + pdfSource );
                 }
-                obj.setItem( key, value );
+                else
+                {
+                    obj.setItem( key, value );
+                }
             }
         }
         char ch = (char)pdfSource.read();


  was:
The issue seems to have been introduced on 01-Sep-2009 in svn revision 810122 
with the addition of the loop to wait for a valid dictionary

Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
===================================================================
--- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java    (revision 793364)
+++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java    (revision 810122)
@@ -183,7 +183,23 @@
             if( c == '>')
             {
                 done = true;
-            }
+            } 
+            else 
+                if(c != '/') 
+                {
+                    //an invalid dictionary, we are expecting
+                    //the key, read until we can recover
+                    logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
+                    int read = pdfSource.read();
+                    while(read != -1 && read != '/' && read != '>')
+                    {
+                        read = pdfSource.read();
+                    }
+                    if(read != -1) 
+                    {
+                        pdfSource.unread(read);
+                    }
+                }
             else
             {
                 COSName key = parseCOSName();
@@ -206,9 +222,12 @@
 
                 if( value == null )
                 {
-                    throw new IOException("Bad Dictionary Declaration " + pdfSource );
+                    logger().warning("Bad Dictionary Declaration " + pdfSource );
                 }
-                obj.setItem( key, value );
+                else
+                {
+                    obj.setItem( key, value );
+                }
             }
         }
         char ch = (char)pdfSource.read();


    Environment: The problem occurs on certain corrupt streams. I have attached 
the file "corrupt-endless-loop-in-0.8.pdf" which is 447 bytes long and exhibits 
this problem. Not sure, but I think this file was originally longer and was 
somehow cut.  (was: The problem occurs on certain corrupt streams. I have 
attached the file "corrupt-endless-loop-in-0.8.pdf" which is 447 bytes long and 
has this problem.  Not sure, but I think this file was originally longer and 
was somehow cut.)

> Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary() 
> on certain corrupt PDF streams
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-537
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-537
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>         Environment: The problem occurs on certain corrupt streams. I have 
> attached the file "corrupt-endless-loop-in-0.8.pdf" which is 447 bytes long 
> and exhibits this problem. Not sure, but I think this file was originally 
> longer and was somehow cut.
>            Reporter: Hacho
>         Attachments: corrupt-endless-loop-in-0.8.pdf, 
> pdfbox-537-proposed-fix.zip, TestPDFBOX537.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The endless loop seems to have been introduced by the changes of 
> 01-Sep-2009 (svn revision 810122), which added a loop that skips input until 
> the start of a dictionary key ('/') or the end of the dictionary ('>') is 
> found.
> Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
> ===================================================================
> --- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java  (revision 793364)
> +++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java  (revision 810122)
> @@ -183,7 +183,23 @@
>              if( c == '>')
>              {
>                  done = true;
> -            }
> +            } 
> +            else 
> +                if(c != '/') 
> +                {
> +                    //an invalid dictionary, we are expecting
> +                    //the key, read until we can recover
> +                    logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
> +                    int read = pdfSource.read();
> +                    while(read != -1 && read != '/' && read != '>')
> +                    {
> +                        read = pdfSource.read();
> +                    }
> +                    if(read != -1) 
> +                    {
> +                        pdfSource.unread(read);
> +                    }
> +                }
>              else
>              {
>                  COSName key = parseCOSName();
> @@ -206,9 +222,12 @@
>  
>                  if( value == null )
>                  {
> -                    throw new IOException("Bad Dictionary Declaration " + pdfSource );
> +                    logger().warning("Bad Dictionary Declaration " + pdfSource );
>                  }
> -                obj.setItem( key, value );
> +                else
> +                {
> +                    obj.setItem( key, value );
> +                }
>              }
>          }
>          char ch = (char)pdfSource.read();
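
The issue text does not say why the added recovery branch never terminates. One plausible reading of the diff, offered here as an assumption rather than a confirmed diagnosis: once a truncated stream reaches end of input, the branch reads -1, pushes nothing back, and leaves "done" unset, so the outer loop repeats without making progress. The sketch below is a standalone simulation and not PDFBox code. The class DictionaryRecoveryDemo, the methods run and peek, and the parameters stopAtEof and maxIterations are all hypothetical names; java.io.PushbackInputStream stands in for pdfSource, the key/value branch is a crude placeholder for parseCOSName() and the value parsing, and the stopAtEof guard is only one possible way to end the loop, not the fix attached to this issue.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

final class DictionaryRecoveryDemo {

    /** Returns the next byte without consuming it, or -1 at end of input. */
    static int peek(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b != -1) {
            in.unread(b);
        }
        return b;
    }

    /**
     * Runs a simplified dictionary loop over the bytes that follow "<<".
     * Returns the number of outer-loop iterations used, or -1 if
     * maxIterations was exceeded (standing in for "never terminates").
     */
    static int run(String body, boolean stopAtEof, int maxIterations) throws IOException {
        PushbackInputStream src = new PushbackInputStream(
                new ByteArrayInputStream(body.getBytes(StandardCharsets.ISO_8859_1)));
        boolean done = false;
        int iterations = 0;
        while (!done) {
            if (++iterations > maxIterations) {
                return -1; // cap reached: the loop made no progress
            }
            int c = peek(src);
            if (c == '>') {
                done = true;
            } else if (c != '/') {
                // Same shape as the recovery branch in the diff:
                // skip ahead to the next '/' or '>' and push it back.
                int read = src.read();
                while (read != -1 && read != '/' && read != '>') {
                    read = src.read();
                }
                if (read != -1) {
                    src.unread(read);
                } else if (stopAtEof) {
                    done = true; // hypothetical guard: give up at end of input
                }
            } else {
                // Placeholder for key and value parsing: consume the '/'
                // and everything up to the next delimiter.
                src.read();
                int read = src.read();
                while (read != -1 && read != '/' && read != '>') {
                    read = src.read();
                }
                if (read != -1) {
                    src.unread(read);
                }
            }
        }
        return iterations;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("complete dictionary, loop as in diff:  " + run("/A 1 >>", false, 1000));
        System.out.println("truncated dictionary, loop as in diff: " + run("/A 1 ", false, 1000));
        System.out.println("truncated dictionary, guarded loop:    " + run("/A 1 ", true, 1000));
    }
}
```

In this simulation the truncated input hits the iteration cap when the loop has the same shape as in the diff, and terminates once the end-of-input case sets "done". Whether the real parser behaves the same way on the attached file depends on details of BaseParser that the diff does not show.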

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
