[ https://issues.apache.org/jira/browse/PDFBOX-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hacho updated PDFBOX-537:
-------------------------
Description:
The endless loop seems to have been introduced on 01-Sep-2009 in svn revision
810122, which added a loop that reads ahead until a valid dictionary key is
found:
Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
===================================================================
--- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 793364)
+++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 810122)
@@ -183,7 +183,23 @@
if( c == '>')
{
done = true;
- }
+ }
+ else
+ if(c != '/')
+ {
+ //an invalid dictionary, we are expecting
+ //the key, read until we can recover
+ logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
+ int read = pdfSource.read();
+ while(read != -1 && read != '/' && read != '>')
+ {
+ read = pdfSource.read();
+ }
+ if(read != -1)
+ {
+ pdfSource.unread(read);
+ }
+ }
else
{
COSName key = parseCOSName();
@@ -206,9 +222,12 @@
if( value == null )
{
- throw new IOException("Bad Dictionary Declaration " + pdfSource );
+ logger().warning("Bad Dictionary Declaration " + pdfSource );
}
- obj.setItem( key, value );
+ else
+ {
+ obj.setItem( key, value );
+ }
}
}
char ch = (char)pdfSource.read();
was:
The issue seems to have been introduced on 01-Sep-2009 in svn revision 810122
with the addition of the loop to wait for a valid dictionary
Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
===================================================================
--- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 793364)
+++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 810122)
@@ -183,7 +183,23 @@
if( c == '>')
{
done = true;
- }
+ }
+ else
+ if(c != '/')
+ {
+ //an invalid dictionary, we are expecting
+ //the key, read until we can recover
+ logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
+ int read = pdfSource.read();
+ while(read != -1 && read != '/' && read != '>')
+ {
+ read = pdfSource.read();
+ }
+ if(read != -1)
+ {
+ pdfSource.unread(read);
+ }
+ }
else
{
COSName key = parseCOSName();
@@ -206,9 +222,12 @@
if( value == null )
{
- throw new IOException("Bad Dictionary Declaration " + pdfSource );
+ logger().warning("Bad Dictionary Declaration " + pdfSource );
}
- obj.setItem( key, value );
+ else
+ {
+ obj.setItem( key, value );
+ }
}
}
char ch = (char)pdfSource.read();
Environment: The problem occurs on certain corrupt streams. I have attached
the file "corrupt-endless-loop-in-0.8.pdf" which is 447 bytes long and exhibits
this problem. Not sure, but I think this file was originally longer and was
somehow cut. (was: The problem occurs on certain corrupt streams. I have
attached the file "corrupt-endless-loop-in-0.8.pdf" which is 447 bytes long and
has this problem. Not sure, but I think this file was originally longer and
was somehow cut.)
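For illustration only, here is a minimal sketch of how the recovery loop could
be made to stop at end of stream. It assumes that the enclosing loop in
parseCOSDictionary() keeps iterating until it sees '>', so a truncated file,
where read() returns -1 before any '>', never ends it. The class and method
names are hypothetical, java.io.PushbackInputStream stands in for pdfSource,
and this is not the content of pdfbox-537-proposed-fix.zip.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the dictionary loop; not PDFBox code.
final class DictionaryRecoverySketch {

    // Skips bytes until the next '/' or '>' and pushes that delimiter back.
    // Returns false if end of stream is reached first, so the caller can stop.
    static boolean skipToNextKeyOrEnd(PushbackInputStream source) throws IOException {
        int read = source.read();
        while (read != -1 && read != '/' && read != '>') {
            read = source.read();
        }
        if (read == -1) {
            return false;
        }
        source.unread(read);
        return true;
    }

    // Counts '/' key markers up to the first '>' or end of stream.
    // Every branch that can observe -1 sets done, so the loop always ends.
    static int countKeys(byte[] data) {
        PushbackInputStream source =
                new PushbackInputStream(new ByteArrayInputStream(data));
        int keys = 0;
        boolean done = false;
        try {
            while (!done) {
                int c = source.read();
                if (c == '>' || c == -1) {
                    done = true;              // closing delimiter or truncated input
                } else if (c == '/') {
                    keys++;                   // start of a dictionary key
                } else if (!skipToNextKeyOrEnd(source)) {
                    done = true;              // recovery ran into end of stream
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return keys;
    }

    public static void main(String[] args) {
        byte[] complete = "/Type /Page >>".getBytes(StandardCharsets.US_ASCII);
        byte[] truncated = "/Type /Pa".getBytes(StandardCharsets.US_ASCII);
        System.out.println(countKeys(complete));   // prints 2
        System.out.println(countKeys(truncated));  // prints 2
    }
}
```

Both the complete and the truncated input return instead of looping, because
end of stream is treated as a reason to leave the outer loop.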
> Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()
> on certain corrupt PDF streams
> ----------------------------------------------------------------------------------------------------------
>
> Key: PDFBOX-537
> URL: https://issues.apache.org/jira/browse/PDFBOX-537
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 0.8.0-incubator
> Environment: The problem occurs on certain corrupt streams. I have
> attached the file "corrupt-endless-loop-in-0.8.pdf" which is 447 bytes long
> and exhibits this problem. Not sure, but I think this file was originally
> longer and was somehow cut.
> Reporter: Hacho
> Attachments: corrupt-endless-loop-in-0.8.pdf,
> pdfbox-537-proposed-fix.zip, TestPDFBOX537.java
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The endless loop seems to have been introduced on 01-Sep-2009 in svn revision
> 810122, which added a loop that reads ahead until a valid dictionary key is
> found:
> Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
> ===================================================================
> --- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 793364)
> +++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 810122)
> @@ -183,7 +183,23 @@
> if( c == '>')
> {
> done = true;
> - }
> + }
> + else
> + if(c != '/')
> + {
> + //an invalid dictionary, we are expecting
> + //the key, read until we can recover
> + logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
> + int read = pdfSource.read();
> + while(read != -1 && read != '/' && read != '>')
> + {
> + read = pdfSource.read();
> + }
> + if(read != -1)
> + {
> + pdfSource.unread(read);
> + }
> + }
> else
> {
> COSName key = parseCOSName();
> @@ -206,9 +222,12 @@
>
> if( value == null )
> {
> - throw new IOException("Bad Dictionary Declaration " + pdfSource );
> + logger().warning("Bad Dictionary Declaration " + pdfSource );
> }
> - obj.setItem( key, value );
> + else
> + {
> + obj.setItem( key, value );
> + }
> }
> }
> char ch = (char)pdfSource.read();