Re: [VOTE] Release Apache PDFBox 2.0.0

2016-03-19 Thread Andreas Lehmkuehler

Hi,

On 14.03.2016 at 18:35, Andreas Lehmkuehler wrote:

Please vote on releasing this package as Apache PDFBox 2.0.0.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

 [ ] +1 Release this package as Apache PDFBox 2.0.0
 [ ] -1 Do not release this package because...
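
As a rough illustration of the pass condition stated above, the check can be sketched as follows. This is only one reading of the rule (at least three +1 PMC votes, and more +1 than -1 votes); the class and method names are made up for this example.

```java
public class VoteTally {
    /**
     * One reading of the quoted rule (an assumption, not an official
     * definition): the vote passes when at least three +1 PMC votes are
     * cast and they outnumber the -1 PMC votes.
     */
    static boolean passes(int plusOnes, int minusOnes) {
        return plusOnes >= 3 && plusOnes > minusOnes;
    }

    public static void main(String[] args) {
        System.out.println(passes(5, 0)); // five +1 votes, no -1 votes
        System.out.println(passes(2, 0)); // fewer than three +1 votes
    }
}
```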


Just a friendly reminder: there are roughly 23 hours left to check the 
release and cast your vote.


TIA
Andreas


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Comment Edited] (PDFBOX-3267) Using threads results in different images

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198498#comment-15198498
 ] 

John Hewson edited comment on PDFBOX-3267 at 3/17/16 12:49 AM:
---

Looking at the different images you posted, my first suspect would be glyph 
caching inside TTF fonts, because the fonts are shared between threads and it 
looks like the glyphs are mixed up.
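
To illustrate the kind of problem suspected here: if a glyph cache is a plain `HashMap` that several rendering threads read and write at once, entries can be lost or corrupted. The sketch below is not PDFBox code; `GlyphCache`, `getGlyph`, and `loadGlyph` are hypothetical names, and it only shows one common way to make such a shared cache safe, using `ConcurrentHashMap.computeIfAbsent`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class GlyphCache {
    // Thread-safe map from glyph ID to a (placeholder) glyph outline.
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    // Counts how often a glyph actually had to be loaded.
    final AtomicInteger loads = new AtomicInteger();

    // Hypothetical stand-in for parsing a glyph from the font file.
    private String loadGlyph(int gid) {
        loads.incrementAndGet();
        return "outline-" + gid;
    }

    // computeIfAbsent runs the loader at most once per key, even when
    // several threads ask for the same glyph at the same time.
    public String getGlyph(int gid) {
        return cache.computeIfAbsent(gid, this::loadGlyph);
    }

    public static void main(String[] args) throws InterruptedException {
        GlyphCache shared = new GlyphCache();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int gid = 0; gid < 100; gid++) {
                    shared.getGlyph(gid);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println("glyphs loaded: " + shared.loads.get());
    }
}
```

Giving each thread its own font or document instance would be another way to avoid sharing, at the cost of more memory.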


was (Author: jahewson):
Looking at the different images you posted, my first suspect would be glyph 
caching inside TTF fonts, because the fonts are shared between threads.

> Using threads results in different images
> -
>
> Key: PDFBOX-3267
> URL: https://issues.apache.org/jira/browse/PDFBOX-3267
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Attachments: ComparePDF.java, bmpwithdpi.pdf, bmpwithdpi2.pdf, 
> t1.png, t2.png
>
>
> If I don't use threads, the images are the same.
> java -cp pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:. ComparePDF
> Exception in thread "main" java.io.IOException: Not equals
>   at ComparePDF.main(ComparePDF.java:21)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Updated] (PDFBOX-2025) Some fonts do not print

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2025:
---
Fix Version/s: 2.0.0

> Some fonts do not print
> ---
>
> Key: PDFBOX-2025
> URL: https://issues.apache.org/jira/browse/PDFBOX-2025
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Windows 7
>Reporter: John Hewson
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> From the mailing list:
> When I try to print the PDF available here, there are 2 boxes on the second
> page with the text "CV 138" and "D946". Both texts are in black boxes. When
> this is printed on a Windows machine, the text in the 2 boxes is missing,
> but on a Mac it's fine. The pdffonts command shows that there are 3 fonts
> (Helvetica, Helvetica-Bold and Courier) that are not embedded in the PDF.
> Could this be causing it?
> How do I solve it so the text will be visible in the boxes?
> Thank you
> Regards
> Joseph






[jira] [Updated] (PDFBOX-3006) Resource leak in 2.0-SNAPSHOT in PDDocument.load(File...)

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-3006:
---
Fix Version/s: 2.0.0

> Resource leak in 2.0-SNAPSHOT in PDDocument.load(File...)
> -
>
> Key: PDFBOX-3006
> URL: https://issues.apache.org/jira/browse/PDFBOX-3006
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, PDModel
>Reporter: Tim Allison
> Fix For: 2.0.0
>
>
> If a user calls PDDocument#load(File...) around line 907 in trunk and there's 
> a parse exception, the {{raFile}} is never closed.  This emerged as an actual 
> issue while running tika-batch to support PDFBOX-2252 and then via code 
> review.
> As a side note, I only looked very quickly, but it isn't clear to me where 
> {{raFile}} is closed under normal circumstances.
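
The pattern at issue, closing a resource when the code that would take ownership of it fails, can be sketched as below. This is not the PDFBox source: `TrackedResource` and `parse` are hypothetical stand-ins for the random-access file and the parser, used only to show the close-on-failure idea.

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseOnFailure {
    // Hypothetical stand-in for the random-access file opened by load().
    static class TrackedResource implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Hypothetical parser: fails on request, like a parse exception.
    static String parse(TrackedResource res, boolean fail) throws IOException {
        if (fail) {
            throw new IOException("parse error");
        }
        return "document";
    }

    // On success the resource stays open (the returned document would own it);
    // on failure it is closed before the exception propagates.
    static String load(TrackedResource res, boolean fail) throws IOException {
        boolean success = false;
        try {
            String doc = parse(res, fail);
            success = true;
            return doc;
        } finally {
            if (!success) {
                res.close();
            }
        }
    }

    public static void main(String[] args) {
        TrackedResource res = new TrackedResource();
        try {
            load(res, true);
        } catch (IOException e) {
            System.out.println("closed after failure: " + res.closed);
        }
    }
}
```

In the success case, whoever ends up owning the resource still has to close it, which is the second question raised in the report.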






[jira] [Commented] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197641#comment-15197641
 ] 

Tilman Hausherr commented on PDFBOX-3274:
-

As I expected, it works fine with the 2.0 version. Poor font rendering is a 
problem with the 1.8 versions. The command line tools are explained here:
https://pdfbox.apache.org/2.0/commandline.html

Alternatively, you could change the creation of the PDF file, e.g. try a 
standard 14 font (e.g. Helvetica) instead of Verdana. This would also make the 
files smaller.

Some advice about the images: the royal seal has artefacts because it is JPEG 
(DCTDecode) encoded. Better use a lossless format, e.g. Flate. The cyan-colored 
box can be painted as a box instead of as an image, which would make the PDF 
smaller.

> Unreadable PDF print
> 
>
> Key: PDFBOX-3274
> URL: https://issues.apache.org/jira/browse/PDFBOX-3274
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.11
> Environment: Windows (7 and 10), Eclipse Mars
>Reporter: Arthur Vuijk
> Attachments: 1062542195549019.pdf
>
>
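> Newbie to PDFBox!!
> {code}
> import java.awt.print.PrinterException;
> import java.awt.print.PrinterJob;
> import java.io.IOException;
> import javax.print.PrintService;
> import org.apache.pdfbox.pdmodel.PDDocument;
>
> // The class declaration was missing from the report; the name is a placeholder.
> public class PrintPdf {
>     public static void main(String[] args) {
>         // Part of the name of the printer to use
>         String printername = "HP Photosmart 6510 series";
>         PDDocument document = null;
>         try {
>             PrinterJob printJob = PrinterJob.getPrinterJob();
>             document = PDDocument.load("E:\\1062542195549019.pdf");
>             // Pick the first print service whose name contains printername
>             PrintService[] printService = PrinterJob.lookupPrintServices();
>             boolean printerfound = false;
>             for (int i = 0; !printerfound && i < printService.length; i++) {
>                 if (printService[i].getName().indexOf(printername) != -1) {
>                     printJob.setPrintService(printService[i]);
>                     printerfound = true;
>                 }
>             }
>             document.silentPrint(printJob);
>             document.close();
>         } catch (IOException e) {
>             // "Fout bij printen" is Dutch for "Error while printing"
>             System.out.println("Fout bij printen");
>             e.printStackTrace();
>         } catch (PrinterException e) {
>             System.out.println("Fout bij printen");
>             e.printStackTrace();
>         }
>     }
> }
> {code}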
> Returns:
> {code}
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: EMC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: EMC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:26 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: EMC
> mrt 15, 2016 10:01:26 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:26 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> INFO: unsupported/disabled operation: EMC
> {code}
> PDF File contains CID font types!






[jira] [Reopened] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson reopened PDFBOX-3274:
-




[jira] [Updated] (PDFBOX-3078) Text height coming in at half size, regression from 1.8

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-3078:
---
Fix Version/s: 2.0.0

> Text height coming in at half size, regression from 1.8
> ---
>
> Key: PDFBOX-3078
> URL: https://issues.apache.org/jira/browse/PDFBOX-3078
> Project: PDFBox
>  Issue Type: Bug
>  Components: Text extraction
>Reporter: Joel Hirsh
> Fix For: 2.0.0
>
> Attachments: PDFBOX-679-toobig-marked-1-modified.png, 
> PDFBOX-679-toobig-marked-1-original.png, wrongsize.pdf
>
>
> Running 11/1 Dvlp build.
> PrintTextLocations on attached file has height of 2.9, which is incorrect.
> String[30.67,144.80005 fs=9.0 xscale=9.0 height=2.9236078 space=2.5020003 
> width=5.0040016]1
> String[35.704,144.80005 fs=9.0 xscale=9.0 height=2.9236078 space=2.5020003 
> width=5.003998]2
> String[40.707996,144.80005 fs=9.0 xscale=9.0 height=2.9236078 space=2.5020003 
> width=5.003998]8
> String[45.711994,144.80005 fs=9.0 xscale=9.0 height=2.9236078 space=2.5020003 
> width=5.003998]6
> String[50.715992,144.80005 fs=9.0 xscale=9.0 height=2.9236078 space=2.5020003 
> width=5.003998]2
> String[63.7,144.80005 fs=9.0 xscale=9.0 height=2.9236078 space=2.5020003 
> width=4.2210045]^
> Same file, Version 1.8 has height of 6.5, which is about right:
> String[30.67,144.80005 fs=9.0 xscale=9.0 height=6.327 space=2.5020003 
> width=5.0040016]1
> String[35.704,144.80005 fs=9.0 xscale=9.0 height=6.327 space=2.5020003 
> width=5.0040016]2
> String[40.708,144.80005 fs=9.0 xscale=9.0 height=6.4980006 space=2.5020003 
> width=5.0040016]8
> String[45.712,144.80005 fs=9.0 xscale=9.0 height=6.4980006 space=2.5020003 
> width=5.0040016]6
> String[50.716003,144.80005 fs=9.0 xscale=9.0 height=6.4980006 space=2.5020003 
> width=5.0040016]2
> String[63.87,144.80005 fs=9.0 xscale=9.0 height=3.8160002 space=2.5020003 
> width=4.220997]^
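
To put a number on "half size", the height values can be pulled out of the output lines above and compared. The sketch below is only an illustration: `HeightRatio` and `height` are hypothetical names, and the regular expression assumes the `height=<number>` format seen in this report. For the first glyph, 2.9236078 / 6.327 comes out at roughly 0.46, a little under half.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeightRatio {
    // Matches the "height=<number>" field of a PrintTextLocations line.
    private static final Pattern HEIGHT = Pattern.compile("height=([0-9.]+)");

    // Extracts the height value from one output line.
    static double height(String line) {
        Matcher m = HEIGHT.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no height in: " + line);
        }
        return Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        // First glyph as reported by the development build and by 1.8.
        String trunk = "String[30.67,144.80005 fs=9.0 xscale=9.0 "
                + "height=2.9236078 space=2.5020003 width=5.0040016]1";
        String v18 = "String[30.67,144.80005 fs=9.0 xscale=9.0 "
                + "height=6.327 space=2.5020003 width=5.0040016]1";
        double ratio = height(trunk) / height(v18);
        System.out.println("ratio = " + ratio);
    }
}
```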






[jira] [Commented] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread Arthur Vuijk (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197851#comment-15197851
 ] 

Arthur Vuijk commented on PDFBOX-3274:
--

I've tested the 2.0 version and changed some of the code. The PDF now prints 
OK. However, I still receive the warnings below:
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+33 (33) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+3 (3) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+51 (51) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+82 (82) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+86 (86) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+87 (87) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+68 (68) in font CIHBGA+Verdana
mrt 16, 2016 7:01:18 PM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+71 (71) in font CIHBGA+Verdana




[jira] [Updated] (PDFBOX-3275) Show glyph bounds in DrawPrintTextLocations

2016-03-19 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3275:

Attachment: 032431.pdf

> Show glyph bounds in DrawPrintTextLocations
> ---
>
> Key: PDFBOX-3275
> URL: https://issues.apache.org/jira/browse/PDFBOX-3275
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
> Attachments: 032431.pdf
>
>
> There have been repeated discussions about getting actual glyph bounds, but 
> no code has been written.






[jira] [Updated] (PDFBOX-1949) Regression: Lines missing in rendered image

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-1949:
---
Fix Version/s: 2.0.0

> Regression: Lines missing in rendered image
> ---
>
> Key: PDFBOX-1949
> URL: https://issues.apache.org/jira/browse/PDFBOX-1949
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>  Labels: regression
> Fix For: 2.0.0
>
> Attachments: PDFBOX-1435_12102.pdf, pdfbox-1435_12102.pdf-1.png
>
>
> The lines are missing in the attached image of PDFBOX-1435.






[jira] [Updated] (PDFBOX-2167) NPE in PDTrueTypeFont.makeFontDescriptor

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2167:
---
Fix Version/s: 2.0.0

> NPE in PDTrueTypeFont.makeFontDescriptor
> 
>
> Key: PDFBOX-2167
> URL: https://issues.apache.org/jira/browse/PDFBOX-2167
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: John Hewson
> Fix For: 2.0.0
>
> Attachments: 268554.pdf
>
>
> I get an NPE with the file from
> http://digitalcorpora.org/corp/nps/files/govdocs1/268/268554.pdf
> {code}
> java.lang.NullPointerException
>   at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.makeFontDescriptor(PDTrueTypeFont.java:292)
>   at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontDescriptor(PDTrueTypeFont.java:150)
>   at org.apache.pdfbox.pdmodel.font.PDFont.getFontWidth(PDFont.java:814)
> IOException for file 268554.pdf
>   at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontWidth(PDTrueTypeFont.java:382)
>   at org.apache.pdfbox.pdmodel.font.PDFont.getFontWidth(PDFont.java:312)
>   at org.apache.pdfbox.pdmodel.font.PDFont.getSpaceWidth(PDFont.java:855)
>   at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:328)
>   at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:44)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:521)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:267)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:226)
>   at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:209)
>   at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:174)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:227)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:109)
> {code}
> I first thought it was the same as PDFBOX-2165, but it's a different line 
> number.






[jira] [Commented] (PDFBOX-3275) Show glyph bounds in DrawPrintTextLocations

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199984#comment-15199984
 ] 

ASF subversion and git services commented on PDFBOX-3275:
-

Commit 1735463 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1735463 ]

PDFBOX-3275: show glyph bounds; refactor existing code

> Show glyph bounds in DrawPrintTextLocations
> ---
>
> Key: PDFBOX-3275
> URL: https://issues.apache.org/jira/browse/PDFBOX-3275
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
>
> There have been repeated discussions about getting actual glyph bounds, but 
> no code has been written.






[jira] [Updated] (PDFBOX-2164) NPE when reading non-terminal fields

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2164:
---
Fix Version/s: 1.8.7
   2.0.0

> NPE when reading non-terminal fields
> 
>
> Key: PDFBOX-2164
> URL: https://issues.apache.org/jira/browse/PDFBOX-2164
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Parsing
>Affects Versions: 1.8.6, 2.0.0
>Reporter: John Hewson
>Assignee: John Hewson
> Fix For: 1.8.7, 2.0.0
>
> Attachments: 4038171_SubstitutionofCounselCOS.pdf, 
> 4251286_ACombinedBates1to72_r.pdf
>
>
> I'm getting an NPE when calling PDDocumentCatalog#getAcroForm() on the two 
> attached PDF files.
> Note: These two PDF files are from the public record in the US.






[jira] [Updated] (PDFBOX-2145) Clean up PDFStreamEngine and PDFTextStripper

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2145:
---
Fix Version/s: 2.0.0

> Clean up PDFStreamEngine and PDFTextStripper
> 
>
> Key: PDFBOX-2145
> URL: https://issues.apache.org/jira/browse/PDFBOX-2145
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Text extraction
>Affects Versions: 2.0.0
>Reporter: John Hewson
>Assignee: John Hewson
>Priority: Minor
> Fix For: 2.0.0
>
>
> PDFStreamEngine and PDFTextStripper don't really meet our coding conventions 
> and have several unused methods and deprecated code which can safely be 
> removed.
> This should clear the way to fixing some bugs in PDFStreamEngine, 
> PDFTextStripper and the various PDFont classes related to text encoding.






[jira] [Closed] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson closed PDFBOX-3274.
---
Resolution: Won't Fix




[jira] [Updated] (PDFBOX-2169) NPE in PDTrueTypeFont.makeFontDescriptor

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2169:
---
Fix Version/s: 2.0.0

> NPE in PDTrueTypeFont.makeFontDescriptor
> 
>
> Key: PDFBOX-2169
> URL: https://issues.apache.org/jira/browse/PDFBOX-2169
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: John Hewson
> Fix For: 2.0.0
>
> Attachments: 000153.pdf
>
>
> The attached file triggers this exception when rendering or extracting text:
> {code}
> java.lang.NullPointerException
>   at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.makeFontDescriptor(PDTrueTypeFont.java:161)
>   at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontDescriptor(PDTrueTypeFont.java:150)
>   at org.apache.pdfbox.pdmodel.font.PDFont.getFontWidth(PDFont.java:814)
>   at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontWidth(PDTrueTypeFont.java:382)
>   at org.apache.pdfbox.pdmodel.font.PDFont.getFontWidth(PDFont.java:312)
>   at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:377)
>   at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:44)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:508)
> {code}






[jira] [Commented] (PDFBOX-3276) Double encryption dictionary for some files

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201097#comment-15201097
 ] 

Tilman Hausherr commented on PDFBOX-3276:
-

[~prstahle] here's a workaround for you until we've fixed the problem:
{code}
COSDictionary trailer = document.getDocument().getTrailer();
trailer.removeItem(COSName.XREF_STM);
trailer.removeItem(COSName.TYPE);
trailer.removeItem(COSName.INDEX);
trailer.removeItem(COSName.W);
trailer.removeItem(COSName.FILTER);
trailer.removeItem(COSName.DECODE_PARMS);
trailer.removeItem(COSName.LENGTH);
document.getDocument().setIsXRefStream(false);
{code}
It will increase the length of the files a little bit. Please test whether the 
effect you described goes away.

> Double encryption dictionary for some files
> ---
>
> Key: PDFBOX-3276
> URL: https://issues.apache.org/jira/browse/PDFBOX-3276
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto, Writing
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
> Attachments: annots-encrypted.pdf, annots.pdf
>
>
> This was first mentioned by Patrick S. in the mailing list:
> {quote}
> This is not a general problem and only occurs with original PDFs generated 
> with 3D content using Anark. The file, when loaded, seems to be encrypted and 
> loads just fine in Adobe Reader, but when we try to do a "Save As" we get the 
> following error:
> "The document could not be saved. There was a problem reading this document 
> 21."
> If I Ctrl-click the "OK" button, I get the following message:
> "This direct object already has a container."
> {quote}
> I can reproduce the effect with the attached file by using the Encrypt 
> command line tool. A look at the file shows a double dictionary:
> {code}
> 593 0 obj
> <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> endobj
> 594 0 obj
> <<
> /ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
> /Info 7 0 R
> /Root 1 0 R
> /Encrypt <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> {code}
> I don't know if this is the cause, but it doesn't belong there.
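
A quick way to spot the pattern shown above in a text dump of the file is a purely textual check for an `/Encrypt` entry followed directly by `<<`. This is only a rough sketch for illustration: the class and method names are hypothetical, and a real check should use a PDF parser rather than a regular expression.

```java
import java.util.regex.Pattern;

public class InlineEncryptCheck {
    // "/Encrypt" followed by "<<" means the dictionary is written inline,
    // as opposed to an indirect reference such as "/Encrypt 593 0 R".
    private static final Pattern INLINE_ENCRYPT = Pattern.compile("/Encrypt\\s*<<");

    static boolean hasInlineEncrypt(String pdfText) {
        return INLINE_ENCRYPT.matcher(pdfText).find();
    }

    public static void main(String[] args) {
        String inline = "/Root 1 0 R\n/Encrypt <<\n/Filter /Standard\n>>";
        String indirect = "/Root 1 0 R\n/Encrypt 593 0 R";
        System.out.println(hasInlineEncrypt(inline));
        System.out.println(hasInlineEncrypt(indirect));
    }
}
```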






[jira] [Updated] (PDFBOX-1996) PDSeparation optimization

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-1996:
---
Fix Version/s: 2.0.0

> PDSeparation optimization
> -
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: pdfbox.patch, pdfbox.patch
>
>
> I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch where I 
> precalculate the black and white pixels and cache them instead of calculating 
> them every time. This optimization gets the page rendering down to less than 
> a second a page. I will attach my patch. Going forward, I could see caching 
> all calculated colours, but floats in hash maps are tricky.
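
The caching idea in the description can be sketched roughly as follows. This is 
not the attached patch: TintCache, its fields and the tintTransform function are 
hypothetical names used only for illustration. The sketch assumes that quantizing 
the tint to 256 levels is acceptable, which also avoids using floats as hash map 
keys.

```java
import java.util.function.Function;

// Hypothetical helper, not part of PDFBox: memoizes an expensive
// tint-to-RGB conversion for a Separation colour space.
final class TintCache {
    private final Function<Float, float[]> tintTransform; // expensive per-pixel function (assumed)
    private final float[][] cache = new float[256][];     // one slot per quantized tint level
    int misses = 0;                                        // number of real evaluations, for illustration

    TintCache(Function<Float, float[]> tintTransform) {
        this.tintTransform = tintTransform;
    }

    // Quantize the tint to 0..255 so the key is an int rather than a float.
    // The returned array is shared between callers and must not be modified.
    float[] toRGB(float tint) {
        int key = Math.round(Math.max(0f, Math.min(1f, tint)) * 255f);
        float[] rgb = cache[key];
        if (rgb == null) {
            rgb = tintTransform.apply(key / 255f);
            cache[key] = rgb;
            misses++;
        }
        return rgb;
    }
}
```

For a black and white page only two slots are ever filled, so the expensive 
function would run twice instead of once per pixel.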






[RESULT][VOTE] Release Apache PDFBox 2.0.0

2016-03-19 Thread Andreas Lehmkuehler

Hi,

Am 14.03.2016 um 18:35 schrieb Andreas Lehmkuehler:

Please vote on releasing this package as Apache PDFBox 2.0.0.


  +1 Tilman Hausherr
  +1 Timo Boehme
  +1 John Hewson
  +1 Maruan Sahyoun
  +1 Andreas Lehmkühler


Thanks for your help and support!! I'm going to push the release out.

BR
Andreas




Re: GSoC 2016

2016-03-19 Thread Tilman Hausherr

Am 17.03.2016 um 01:29 schrieb John Hewson:

We could do with a new text extractor, from the ground up.


But not this year with me as main mentor; I'm not deep enough into text 
extraction yet. I'm willing, though, to support whatever somebody else 
does with my opinions, suggestions, and tests.


The time investment in GSoC for me was about 1 hour per day, maybe more 
at the beginning, and less later, and more at the very end.


Tilman






[jira] [Commented] (PDFBOX-3267) Using threads results in different images

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201772#comment-15201772
 ] 

Tilman Hausherr commented on PDFBOX-3267:
-

Can you try some older 2.0 builds (e.g. RC1 and earlier) and check whether the 
effect also happens there? I can't help you myself, because I can't reproduce it.

> Using threads results in different images
> -
>
> Key: PDFBOX-3267
> URL: https://issues.apache.org/jira/browse/PDFBOX-3267
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Attachments: ComparePDF.java, bmpwithdpi.pdf, bmpwithdpi2.pdf, 
> t1.png, t2.png
>
>
> If I don't use threads, the images are the same
> java -cp pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:. ComparePDF
> Exception in thread "main" java.io.IOException: Not equals
>   at ComparePDF.main(ComparePDF.java:21)






[jira] [Updated] (PDFBOX-1939) Store all stroke information in the graphics state

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-1939:
---
Fix Version/s: 2.0.0

> Store all stroke information in the graphics state
> --
>
> Key: PDFBOX-1939
> URL: https://issues.apache.org/jira/browse/PDFBOX-1939
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: John Hewson
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> Recently PDFBOX-1917 has fixed an issue with how the current AWT stroke is 
> calculated. This prompted me to look at the file from PDFBOX-1094 which 
> contains separate stroking errors. This led to the identification of a 
> problem: the BasicStroke is being used to keep track of the stroke state, 
> rather than using the information from the graphics state. This fails when 
> the graphics state is modified e.g. Save/Restore and the BasicStroke in 
> PageDrawer is not correspondingly updated.
> Having looked at the code, it seems that this is a long-standing issue. The 
> solution is to remove the BasicStroke variable from PageDrawer and to 
> calculate it each time it is needed, using only the information stored in the 
> graphics state. The following classes which directly modify the BasicStroke 
> can be removed:
> pagedrawer.SetLineCapStyle
> pagedrawer.SetLineDashPattern
> pagedrawer.SetLineJoinStyle
> pagedrawer.SetLineMiterLimit
> pagedrawer.SetLineWidth
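
A minimal sketch of the proposed approach, assuming a much simplified graphics 
state: StrokeState and StrokeDrawer are hypothetical stand-ins, not PDFBox 
classes, and the dash pattern is left out for brevity. The point is that the 
BasicStroke is derived on demand, so it cannot go stale after Save/Restore.

```java
import java.awt.BasicStroke;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified stand-in for the stroke part of a PDF graphics state.
final class StrokeState {
    float lineWidth = 1f;
    int lineCap = BasicStroke.CAP_BUTT;
    int lineJoin = BasicStroke.JOIN_MITER;
    float miterLimit = 10f;

    StrokeState copy() {
        StrokeState s = new StrokeState();
        s.lineWidth = lineWidth;
        s.lineCap = lineCap;
        s.lineJoin = lineJoin;
        s.miterLimit = miterLimit;
        return s;
    }
}

// Hypothetical drawer that keeps no BasicStroke field of its own.
final class StrokeDrawer {
    private final Deque<StrokeState> stack = new ArrayDeque<>();
    private StrokeState current = new StrokeState();

    void save() { stack.push(current.copy()); }            // PDF operator "q"
    void restore() { current = stack.pop(); }              // PDF operator "Q"
    void setLineWidth(float w) { current.lineWidth = w; }  // PDF operator "w"

    // Build the AWT stroke each time it is needed, from the graphics state only.
    BasicStroke getStroke() {
        return new BasicStroke(current.lineWidth, current.lineCap,
                               current.lineJoin, current.miterLimit);
    }
}
```

With this shape, a restore() automatically brings back the previous line width 
because the stroke is always read from whatever state is current.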






[jira] [Commented] (PDFBOX-3198) Visible Signature N2 layer

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202715#comment-15202715
 ] 

Tilman Hausherr commented on PDFBOX-3198:
-

Still waiting for example code.

> Visible Signature N2 layer
> --
>
> Key: PDFBOX-3198
> URL: https://issues.apache.org/jira/browse/PDFBOX-3198
> Project: PDFBox
>  Issue Type: New Feature
>  Components: Signing
>Reporter: Frank Cornelis
>Priority: Minor
> Fix For: 1.8.12
>
> Attachments: pdfbox-n2-2.patch
>
>
> The patch adds N2 layer support to visible signatures.






[jira] [Closed] (PDFBOX-3278) "Throws IOException" in PDFTextStripper constructor is useless

2016-03-19 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson closed PDFBOX-3278.
---
Resolution: Invalid

No, it's not useless. As Tilman says, the same behaviour is simply being 
inherited from PDFTextStreamEngine's constructor in 2.0, via the implicit 
super() call.
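
The language rule behind this can be shown with a small sketch. BaseEngine and 
Stripper are hypothetical stand-ins for PDFTextStreamEngine and PDFTextStripper, 
and loadGlyphList() is an invented placeholder for whatever the real base class 
loads.

```java
import java.io.IOException;

// Hypothetical stand-in for the base class whose constructor may fail.
class BaseEngine {
    final boolean glyphListLoaded;

    BaseEngine() throws IOException {
        glyphListLoaded = loadGlyphList();
    }

    // Placeholder for resource loading that can throw a checked exception.
    private static boolean loadGlyphList() throws IOException {
        return true;
    }
}

// Hypothetical stand-in for the subclass.
class Stripper extends BaseEngine {
    // Removing "throws IOException" here is a compile error: the implicit
    // super() call can throw a checked exception, and a constructor cannot
    // wrap the super() call in try/catch.
    Stripper() throws IOException {
    }
}
```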

> "Throws IOException" in PDFTextStripper constructor is useless
> --
>
> Key: PDFBOX-3278
> URL: https://issues.apache.org/jira/browse/PDFBOX-3278
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Text extraction
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Nicolas M
>Priority: Trivial
>
> In version 1.8.x, PDFTextStripper could throw an IOException because of the 
> loading of a properties file. In 2.x, the properties file doesn't exist 
> anymore but the constructor is still "public PDFTextStripper() throws 
> IOException".






[jira] [Updated] (PDFBOX-3276) Double encryption dictionary for some files

2016-03-19 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3276:

Component/s: Writing

> Double encryption dictionary for some files
> ---
>
> Key: PDFBOX-3276
> URL: https://issues.apache.org/jira/browse/PDFBOX-3276
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto, Writing
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
> Attachments: annots-encrypted.pdf, annots.pdf
>
>
> This was first mentioned by Patrick S. in the mailing list:
> {quote}
> This is not a general problem and only occurs with original PDF generated 
> with 3D content using Anark. The file when loaded seems to be encrypted and 
> loads just fine in Adobe Reader, but when we try to do a "Save As" we get the 
> following error:
> "The document could not be saved. There was a problem reading this document 
> 21."
> If I do a control click on the "ok" button, I get the following message:
> "This direct object already has a container."
> {quote}
> I can reproduce the effect with the attached file by using the Encrypt 
> command line tool. A look at the file shows a double dictionary:
> {code}
> 593 0 obj
> <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> endobj
> 594 0 obj
> <<
> /ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
> /Info 7 0 R
> /Root 1 0 R
> /Encrypt <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> {code}
> I don't know if this is the cause, but it doesn't belong there.






[jira] [Updated] (PDFBOX-3276) Double encryption dictionary for some files

2016-03-19 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3276:

Description: 
This was first mentioned by Patrick S. in the mailing list:
{quote}
This is not a general problem and only occurs with original PDF generated with 
3D content using Anark. The file when loaded seems to be encrypted and loads 
just fine in Adobe Reader, but when we try to do a "Save As" we get the 
following error:
"The document could not be saved. There was a problem reading this document 21."

If I do a control click on the "ok" button, I get the following message:
"This direct object already has a container."
{quote}
I can reproduce the effect with the attached file by using the Encrypt command 
line tool. A look at the file shows a double dictionary:
{code}
593 0 obj
<<
/Filter /Standard
/V 1
/R 3
/Length 40
/P -4
/O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
/U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
>>
endobj
594 0 obj
<<
/ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
/Info 7 0 R
/Root 1 0 R
/Encrypt <<
/Filter /Standard
/V 1
/R 3
/Length 40
/P -4
/O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
/U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
>>
{code}
I don't know if this is the cause, but it doesn't belong there.

  was:
This was first mentioned by Patrick S. in the mailing list:
{quote}
This is not a general problem and only occurs with original PDF generated with 
3D content using Anark. The file when loaded seems to be encrypted and loads 
just fine in Adobe Reader, but when we try to do a "Save As" we get the 
following error:
"The document could not be saved. There was a problem reading this document 21."

If I do a control click on the "ok" button, I get the following message:
"This direct object already has a container."
{quote}
I can reproduce the effect with the attached file by using the Encrypt command 
line tool.


> Double encryption dictionary for some files
> ---
>
> Key: PDFBOX-3276
> URL: https://issues.apache.org/jira/browse/PDFBOX-3276
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
>
> This was first mentioned by Patrick S. in the mailing list:
> {quote}
> This is not a general problem and only occurs with original PDF generated 
> with 3D content using Anark. The file when loaded seems to be encrypted and 
> loads just fine in Adobe Reader, but when we try to do a "Save As" we get the 
> following error:
> "The document could not be saved. There was a problem reading this document 
> 21."
> If I do a control click on the "ok" button, I get the following message:
> "This direct object already has a container."
> {quote}
> I can reproduce the effect with the attached file by using the Encrypt 
> command line tool. A look at the file shows a double dictionary:
> {code}
> 593 0 obj
> <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> endobj
> 594 0 obj
> <<
> /ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
> /Info 7 0 R
> /Root 1 0 R
> /Encrypt <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> {code}
> I don't know if this is the cause, but it doesn't belong there.






Re: GSoC 2016

2016-03-19 Thread John Hewson

> On 15 Mar 2016, at 23:09, Sumit Saha  wrote:
> 
> Hi Everyone,
> Can you please point out any good documents on transparency groups apart from 
> PDF spec 1.7.

Yes, there’s an Adobe technical note, it’s 80 pages long:

http://www.planetpdf.com/planetpdf/pdfs/PDF_Transparency2.pdf

— John

> Thanks
> Sumit
> 
> -Original Message-
> From: Maruan Sahyoun [mailto:sahy...@fileaffairs.de] 
> Sent: Wednesday, March 16, 2016 1:51 AM
> To: dev@pdfbox.apache.org
> Subject: Re: GSoC 2016
> 
> Well, we could use some help in improving the annotation support 
> (construction, appearance generation, ...). I'm not a skilled enough 
> programmer to mentor someone, though.
> 
>> Am 15.03.2016 um 21:18 schrieb Tilman Hausherr :
>> 
>> I don't have a good project idea for GSoC 2016, which is sad because I like 
>> the concept.
>> 
>> Here are some not-so-good project ideas:
>> 
>> - JPX converter - we really need that, but I don't have enough skills 
>> to be able to mentor that. I had an idea who could mentor that and he 
>> would be perfect, but he doesn't have the time :-(
>> - Transparency groups - several skilled people have tried and have not fully 
>> succeeded in this, so it could well be that a student tries and then fails 
>> and everybody is unhappy.
>> 
>> Thus this year, I'll just relax and maybe do some stuff on my own.
>> 
>> Tilman
>> 





[jira] [Updated] (PDFBOX-2172) Extra options for PDFToImage

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2172:
---
Fix Version/s: 2.0.0

> Extra options for PDFToImage
> 
>
> Key: PDFBOX-2172
> URL: https://issues.apache.org/jira/browse/PDFBOX-2172
> Project: PDFBox
>  Issue Type: Improvement
>Reporter: John Hewson
>Priority: Minor
> Fix For: 2.0.0
>
>
> I've added a {{-page}} command line option for PDFToImage which is equivalent 
> to setting {{-startPage}} and {{-endPage}} to the same value.
> I've also renamed {{-resolution}} to {{-dpi}} because it's actually a DPI. 
> I've still allowed the {{-resolution}} flag to be passed, it's just no longer 
> documented.
> Likewise {{-imageType}} is now the more standard {{-format}} but the original 
> is still permitted, though undocumented.
> Same again for {{-outputPrefix}} which is now {{-prefix}}.
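
The alias behaviour described above could look roughly like this. It is only a 
sketch, not the PDFToImage source: OptionAliases and parse() are hypothetical 
names, and the sketch assumes that every option is followed by exactly one value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical option parser illustrating the renamed flags and their old aliases.
final class OptionAliases {
    // Old (still accepted, undocumented) name -> new documented name.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("-resolution", "-dpi");
        ALIASES.put("-imageType", "-format");
        ALIASES.put("-outputPrefix", "-prefix");
    }

    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            String name = ALIASES.getOrDefault(args[i], args[i]);
            String value = args[i + 1];
            if (name.equals("-page")) {
                // -page N is shorthand for -startPage N -endPage N
                opts.put("-startPage", value);
                opts.put("-endPage", value);
            } else {
                opts.put(name, value);
            }
        }
        return opts;
    }
}
```

Normalizing the old names up front means the rest of the tool only has to know 
the new names.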






[jira] [Commented] (PDFBOX-3267) Using threads results in different images

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198498#comment-15198498
 ] 

John Hewson commented on PDFBOX-3267:
-

Looking at the different images you posted, my first suspect would be glyph 
caching inside TTF fonts.

> Using threads results in different images
> -
>
> Key: PDFBOX-3267
> URL: https://issues.apache.org/jira/browse/PDFBOX-3267
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Attachments: ComparePDF.java, bmpwithdpi.pdf, bmpwithdpi2.pdf, 
> t1.png, t2.png
>
>
> If I don't use threads, the images are the same
> java -cp pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:. ComparePDF
> Exception in thread "main" java.io.IOException: Not equals
>   at ComparePDF.main(ComparePDF.java:21)






[jira] [Commented] (PDFBOX-3276) Double encryption dictionary for files with XRef stream

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201781#comment-15201781
 ] 

ASF subversion and git services commented on PDFBOX-3276:
-

Commit 1735641 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1735641 ]

PDFBOX-3276: /Encrypt dictionary has already been written, so don't make it 
direct and write it a second time
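
One plausible reading of this commit message, sketched with a toy writer: if a 
dictionary has already been written as an indirect object, the trailer should 
refer to it instead of writing it inline again. TrailerSketch is hypothetical and 
is not PDFBox's writer code; the actual change is in r1735641.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical miniature writer; it is NOT PDFBox's writer code.
final class TrailerSketch {
    // Object numbers of dictionaries already written as indirect objects.
    private final Map<Map<String, String>, Integer> written = new IdentityHashMap<>();
    private final StringBuilder out = new StringBuilder();

    // Write "N 0 obj << ... >> endobj" and remember the object number.
    void writeIndirect(int objNum, Map<String, String> dict) {
        out.append(objNum).append(" 0 obj\n");
        writeDirect(dict);
        out.append("endobj\n");
        written.put(dict, objNum);
    }

    // Write a dictionary entry: a reference if the value was already written,
    // otherwise the dictionary itself as a direct object.
    void writeEntry(String key, Map<String, String> dict) {
        out.append(key).append(' ');
        Integer objNum = written.get(dict);
        if (objNum != null) {
            out.append(objNum).append(" 0 R\n");
        } else {
            writeDirect(dict);
        }
    }

    private void writeDirect(Map<String, String> dict) {
        out.append("<<\n");
        for (Map.Entry<String, String> e : dict.entrySet()) {
            out.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        out.append(">>\n");
    }

    String output() {
        return out.toString();
    }
}
```

With the encryption dictionary written once as object 593, the trailer entry in 
this sketch becomes "/Encrypt 593 0 R" rather than a second inline copy.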

> Double encryption dictionary for files with XRef stream
> ---
>
> Key: PDFBOX-3276
> URL: https://issues.apache.org/jira/browse/PDFBOX-3276
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto, Writing
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
> Attachments: annots-encrypted.pdf, annots.pdf
>
>
> This was first mentioned by Patrick S. in the mailing list:
> {quote}
> This is not a general problem and only occurs with original PDF generated 
> with 3D content using Anark. The file when loaded seems to be encrypted and 
> loads just fine in Adobe Reader, but when we try to do a "Save As" we get the 
> following error:
> "The document could not be saved. There was a problem reading this document 
> 21."
> If I do a control click on the "ok" button, I get the following message:
> "This direct object already has a container."
> {quote}
> I can reproduce the effect with the attached file by using the Encrypt 
> command line tool. A look at the file shows a double dictionary:
> {code}
> 593 0 obj
> <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> endobj
> 594 0 obj
> <<
> /ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
> /Info 7 0 R
> /Root 1 0 R
> /Encrypt <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> {code}
> I don't know if this is the cause, but it doesn't belong there.






[jira] [Commented] (PDFBOX-3030) Enhance documentation for PDFBox 2.0.0

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202738#comment-15202738
 ] 

ASF subversion and git services commented on PDFBOX-3030:
-

Commit 01f0dbc5f4d22757628c2641024600e56932b50a in pdfbox-docs's branch 
refs/heads/master from [~msahyoun]
[ https://git-wip-us.apache.org/repos/asf?p=pdfbox-docs.git;h=01f0dbc ]

PDFBOX-3030: add info for PDOutlineNode changes


> Enhance documentation for PDFBox 2.0.0
> --
>
> Key: PDFBOX-3030
> URL: https://issues.apache.org/jira/browse/PDFBOX-3030
> Project: PDFBox
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
> Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 






[jira] [Commented] (PDFBOX-3030) Enhance documentation for PDFBox 2.0.0

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202737#comment-15202737
 ] 

ASF subversion and git services commented on PDFBOX-3030:
-

Commit 017c8d585dd016e1a41c3caa497a49412d9a975f in pdfbox-docs's branch 
refs/heads/master from [~msahyoun]
[ https://git-wip-us.apache.org/repos/asf?p=pdfbox-docs.git;h=017c8d5 ]

PDFBOX-3030: fix typo


> Enhance documentation for PDFBox 2.0.0
> --
>
> Key: PDFBOX-3030
> URL: https://issues.apache.org/jira/browse/PDFBOX-3030
> Project: PDFBox
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
> Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 






[jira] [Updated] (PDFBOX-2137) Rendering of Type3 string fails with NPE

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2137:
---
Fix Version/s: 2.0.0

> Rendering of Type3 string fails with NPE
> 
>
> Key: PDFBOX-2137
> URL: https://issues.apache.org/jira/browse/PDFBOX-2137
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Petr Slaby
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
> Attachments: 000536.pdf, PageDrawer.drawType3String.patch
>
>
> Rendering of the attached PDF fails with an NPE in PDFStreamEngine at line 
> 395 (float spaceWidthDisp = ...) because the textMatrix field is null. The 
> reason is that the textMatrix gets reset while processing the first character 
> of the string in PageDrawer#drawType3String(). The attached patch fixes the 
> problem, but I am not quite sure whether it is the right solution.
> To debug this, set a breakpoint in PDFStreamEngine#processEncodedText() with 
> the condition "string.length == 2 && string[1] == 67" and watch the 
> textMatrix field vanish after the first character has been processed.






[jira] [Commented] (PDFBOX-1705) can not Write Hebrew and Chinese word into a PDF

2016-03-19 Thread Tiruppathi Rajan Gunaseelan (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200295#comment-15200295
 ] 

Tiruppathi Rajan Gunaseelan commented on PDFBOX-1705:
-

Hi,

I was unable to write Chinese content into a PDF using your attached 
EmbeddedFonts.java. Later I found a way to fix it by feeding in 
"arialuni.ttf" instead of LiberationSans-Regular.ttf in your example. I 
confirmed the Chinese PDF generation by writing hardcoded Chinese data in my 
example. I then extended this piece of logic to write the desired PDF in 
my application but ran into "java.lang.IllegalArgumentException: U+008A is not 
available in this font's encoding: WinAnsiEncoding". Please advise.

1. From my application, I call an external web service, which returns the 
Chinese + English content in the response as a String.
2. I read this response string using UTF-8 as the Charset and unmarshal it to a 
Java bean using JAXRB.
3. To contentStream.drawString() I pass the string obtained by calling the 
getter method on the POJO, which has the Chinese data stored in a String 
variable. This is where I am getting the above exception. Please help me to fix 
this.

I am using PDFBox version 2.0.0-RC3. Your help is much appreciated.

Regards,
Tiru

> can not Write Hebrew and Chinese word into a PDF 
> -
>
> Key: PDFBOX-1705
> URL: https://issues.apache.org/jira/browse/PDFBOX-1705
> Project: PDFBox
>  Issue Type: Bug
>  Components: Writing
>Affects Versions: 1.8.1
>Reporter: meiyuanxun
> Fix For: 2.0.0
>
>
> Cannot write Hebrew or Chinese into a PDF file. It shows unreadable characters. 
> If the latest version does not support this, please let me know. Thank you.
> {code}
> PDDocument document = new PDDocument();
> PDPage page = new PDPage();
> document.addPage( page );
> PDFont font = PDTrueTypeFont.loadTTF(document, "pdf/simkai.ttf");
> PDPageContentStream contentStream = new PDPageContentStream(document, page);
> contentStream.beginText();
> contentStream.setFont( font, 12 );
> contentStream.moveTextPositionByAmount( 100, 700 );
> contentStream.drawString("中文 = Chinese");
> contentStream.drawString("Hebrew= העתק");
> contentStream.endText();
> contentStream.close();
> document.save( "pdf/Hello World.pdf");
> document.close();
> {code}






[jira] [Commented] (PDFBOX-1705) can not Write Hebrew and Chinese word into a PDF

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200304#comment-15200304
 ] 

Tilman Hausherr commented on PDFBOX-1705:
-

This isn't a forum. You are writing to a closed issue about a "how to" problem. 
Please read https://pdfbox.apache.org/support.html and then ask your question 
in the mailing list or in stackoverflow.

> can not Write Hebrew and Chinese word into a PDF 
> -
>
> Key: PDFBOX-1705
> URL: https://issues.apache.org/jira/browse/PDFBOX-1705
> Project: PDFBox
>  Issue Type: Bug
>  Components: Writing
>Affects Versions: 1.8.1
>Reporter: meiyuanxun
> Fix For: 2.0.0
>
>
> Cannot write Hebrew or Chinese into a PDF file. It shows unreadable characters. 
> If the latest version does not support this, please let me know. Thank you.
> {code}
> PDDocument document = new PDDocument();
> PDPage page = new PDPage();
> document.addPage( page );
> PDFont font = PDTrueTypeFont.loadTTF(document, "pdf/simkai.ttf");
> PDPageContentStream contentStream = new PDPageContentStream(document, page);
> contentStream.beginText();
> contentStream.setFont( font, 12 );
> contentStream.moveTextPositionByAmount( 100, 700 );
> contentStream.drawString("中文 = Chinese");
> contentStream.drawString("Hebrew= העתק");
> contentStream.endText();
> contentStream.close();
> document.save( "pdf/Hello World.pdf");
> document.close();
> {code}






[jira] [Created] (PDFBOX-3276) Double encryption dictionary for some files

2016-03-19 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-3276:
---

 Summary: Double encryption dictionary for some files
 Key: PDFBOX-3276
 URL: https://issues.apache.org/jira/browse/PDFBOX-3276
 Project: PDFBox
  Issue Type: Bug
  Components: Crypto
Affects Versions: 2.0.0, 2.0.1, 2.1.0
Reporter: Tilman Hausherr


This was first mentioned by Patrick S. in the mailing list:
{quote}
This is not a general problem and only occurs with original PDF generated with 
3D content using Anark. The file when loaded seems to be encrypted and loads 
just fine in Adobe Reader, but when we try to do a "Save As" we get the 
following error:
"The document could not be saved. There was a problem reading this document 21."

If I do a control click on the "ok" button, I get the following message:
"This direct object already has a container."
{quote}
I can reproduce the effect with the attached file by using the Encrypt command 
line tool.






[jira] [Commented] (PDFBOX-3278) "Throws IOException" in PDFTextStripper constructor is useless

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201771#comment-15201771
 ] 

Tilman Hausherr commented on PDFBOX-3278:
-

But the base class PDFTextStreamEngine constructor throws it because it is 
loading a glyph list. Did you try to build?

> "Throws IOException" in PDFTextStripper constructor is useless
> --
>
> Key: PDFBOX-3278
> URL: https://issues.apache.org/jira/browse/PDFBOX-3278
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Text extraction
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Nicolas M
>Priority: Trivial
>
> In version 1.8.x, PDFTextStripper could throw an IOException because of the 
> loading of a properties file. In 2.x, the properties file doesn't exist 
> anymore but the constructor is still "public PDFTextStripper() throws 
> IOException".






Re: Open TrueType Fonts

2016-03-19 Thread John Hewson

> On 17 Mar 2016, at 13:29, George Sexton  wrote:
> 
> I was looking through things on my server and I'm seeing that my application 
> has about 130 TrueType font files open. I'm running this under Tomcat so the 
> process is persistent.
> 
> I've dug through the source code a little bit and I can see that 
> AbstractTTFParser.parseTTF(File ttfFile) explicitly doesn't close the stream. 
> I'm guessing this is because there's a way to do deferred parsing of the file.

Yes, that’s right.

> I looked at o.a.fontbox.util.FontManager as well, and it looks like when it 
> gets created, it's loading all available truetype fonts and at least getting 
> the Font Names table to map the font names to the constructed TrueTypeFont 
> objects.

That’s right, though it only does this once and then saves the naming metadata 
to an on-disk cache.

> Is there ever a point when the file input stream could be closed? Is there 
> some operation performed on TrueTypeFont that would definitively conclude 
> access to the stream?

When we do the scan, we close each font immediately afterwards. If PDFBox then 
goes on to use that font, we’ll re-open the file and then cache the new font. 
The cache is a map of SoftReference objects, which get released when the JVM 
feels like it. But now I’m wondering, when do we close fonts which are dropped 
from the FontCache? Maybe there’s an issue here…

— John

> 
> -- 
> George Sexton
> *MH Software, Inc.*
> Voice: 303 438 9585
> http://www.connectdaily.com





[jira] [Commented] (PDFBOX-3276) Double encryption dictionary for files with XRef stream

2016-03-19 Thread Patrick Stahle (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201931#comment-15201931
 ] 

Patrick Stahle commented on PDFBOX-3276:


Hi Tilman,

The workaround works.

Thanks.

> Double encryption dictionary for files with XRef stream
> ---
>
> Key: PDFBOX-3276
> URL: https://issues.apache.org/jira/browse/PDFBOX-3276
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto, Writing
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 2.0.1, 2.1.0
>
> Attachments: annots-encrypted.pdf, annots.pdf
>
>
> This was first mentioned by Patrick S. in the mailing list:
> {quote}
> This is not a general problem and only occurs with original PDFs generated 
> with 3D content using Anark. The file, when loaded, seems to be encrypted and 
> loads just fine in Adobe Reader, but when we try to do a "Save As" we get the 
> following error:
> "The document could not be saved. There was a problem reading this document 
> 21."
> If I control-click the "OK" button, I get the following message:
> "This direct object already has a container."
> {quote}
> I can reproduce the effect with the attached file by using the Encrypt 
> command line tool. A look at the file shows a double dictionary:
> {code}
> 593 0 obj
> <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> endobj
> 594 0 obj
> <<
> /ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
> /Info 7 0 R
> /Root 1 0 R
> /Encrypt <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> {code}
> I don't know if this is the cause, but it doesn't belong there.






[jira] [Updated] (PDFBOX-2213) NPE in PageDrawer.drawString

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2213:
---
Fix Version/s: 2.0.0

> NPE in PageDrawer.drawString
> 
>
> Key: PDFBOX-2213
> URL: https://issues.apache.org/jira/browse/PDFBOX-2213
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> File from PDFBOX-122
> java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFToImage 
> PDFBOX122-UniJIS-UCS2-HW-H_sample.pdf 
> {code}
> Jul 22, 2014 9:38:55 PM org.apache.pdfbox.rendering.PageDrawer createAWTFont
> INFORMATION: Unsupported type of font 
> org.apache.pdfbox.pdmodel.font.PDType0Font
> Jul 22, 2014 9:38:55 PM org.apache.pdfbox.rendering.PageDrawer createAWTFont
> INFORMATION: Using font SansSerif.plain instead of ?l?r?¥?®
> Exception in thread "main" java.lang.NullPointerException
> at sun.font.StandardGlyphVector.<init>(Unknown Source)
> at java.awt.Font.createGlyphVector(Unknown Source)
> at 
> org.apache.pdfbox.rendering.PageDrawer.drawString(PageDrawer.java:415)
> at 
> org.apache.pdfbox.rendering.PageDrawer.processGlyph(PageDrawer.java:331)
> at 
> org.apache.pdfbox.util.PDFStreamEngine.processText(PDFStreamEngine.java:503)
> {code}






[jira] [Created] (PDFBOX-3275) Show glyph bounds in DrawPrintTextLocations

2016-03-19 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-3275:
---

 Summary: Show glyph bounds in DrawPrintTextLocations
 Key: PDFBOX-3275
 URL: https://issues.apache.org/jira/browse/PDFBOX-3275
 Project: PDFBox
  Issue Type: Improvement
  Components: Rendering
Affects Versions: 2.0.0, 2.0.1, 2.1.0
Reporter: Tilman Hausherr


There have been repeated discussions about getting actual glyph bounds, but no 
code has been written.






[jira] [Commented] (PDFBOX-3276) Double encryption dictionary for files with XRef stream

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201780#comment-15201780
 ] 

ASF subversion and git services commented on PDFBOX-3276:
-

Commit 1735640 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1735640 ]

PDFBOX-3276: /Encrypt dictionary has already been written, so don't make it 
direct and write it a second time




[jira] [Updated] (PDFBOX-1926) Document.save() after Document.close() causes Null Pointer Exception

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-1926:
---
Fix Version/s: 2.0.0

> Document.save() after Document.close() causes Null Pointer Exception
> 
>
> Key: PDFBOX-1926
> URL: https://issues.apache.org/jira/browse/PDFBOX-1926
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
> Environment: Linux (Ubuntu)
>Reporter: Bastian Preindl
>Priority: Minor
> Fix For: 2.0.0
>
>
> Until version 1.8.3 it was possible to perform a document's save()-call also 
> if the document's close()-method has already been called. In my opinion this 
> made sense as I'd like to persist the document's content also if no more 
> (content changing) operations are performed on it.
> Anyway, in 1.8.4, calling save() after close() causes a NullPointerException:
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.pdfbox.pdmodel.PDDocument.getDocumentCatalog(PDDocument.java:765)
>   at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1346)






[jira] [Updated] (PDFBOX-3273) Fonts not rendered correctly

2016-03-19 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3273:

Attachment: PDFBOX-3273.pdf

> Fonts not rendered correctly
> 
>
> Key: PDFBOX-3273
> URL: https://issues.apache.org/jira/browse/PDFBOX-3273
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Oracle Java8 - OS X Yosemite 10.10.2
>Reporter: Jacopo Pugliese
>  Labels: type1font
> Attachments: PDFBOX-3273.pdf
>
>
> Fonts are not correctly rendered when using PDFBOX 2.0.0RC3  to extract 
> images from a pdf.
> Here is the pdf I am using for testing and two examples of images (the 
> wrongly rendered characters are the red ones in the yellow box):
> https://drive.google.com/open?id=0B9ji30i4c2KmcndCZGYxbk5telE






[jira] [Updated] (PDFBOX-2936) javax.crypto.BadPaddingException: Given final block not properly padded

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2936:
---
Fix Version/s: 1.8.11
   2.0.0

> javax.crypto.BadPaddingException: Given final block not properly padded
> ---
>
> Key: PDFBOX-2936
> URL: https://issues.apache.org/jira/browse/PDFBOX-2936
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto, Preflight
>Affects Versions: 1.8.10, 1.8.11, 2.0.0
> Environment: java version "1.8.0_25"
> Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
>Reporter: Daniel Woelfel
>Assignee: Tilman Hausherr
>  Labels: DHS
> Fix For: 1.8.11, 2.0.0
>
> Attachments: i-129.pdf
>
>
> I get the following stack trace when trying to parse certain pdfs:
> {noformat}
> org.apache.pdfbox.preflight.exception.SyntaxValidationException
> at 
> org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:203)
> at 
> org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:180)
> at 
> org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:168)
> at PDFBoxTest.main(hello.java:11)
> Caused by: org.apache.pdfbox.exceptions.WrappedIOException
> at 
> org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(SecurityHandler.java:376)
> at 
> org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptString(SecurityHandler.java:578)
> at 
> org.apache.pdfbox.pdfparser.NonSequentialPDFParser.decryptString(NonSequentialPDFParser.java:1571)
> at 
> org.apache.pdfbox.pdfparser.NonSequentialPDFParser.decryptDictionary(NonSequentialPDFParser.java:1535)
> at 
> org.apache.pdfbox.pdfparser.NonSequentialPDFParser.decrypt(NonSequentialPDFParser.java:1596)
> at 
> org.apache.pdfbox.preflight.parser.PreflightParser.parseObjectDynamically(PreflightParser.java:797)
> at 
> org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1343)
> at 
> org.apache.pdfbox.preflight.parser.PreflightParser.initialParse(PreflightParser.java:273)
> at 
> org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:886)
> at 
> org.apache.pdfbox.preflight.parser.PreflightParser.parse(PreflightParser.java:198)
> ... 3 more
> Caused by: javax.crypto.BadPaddingException: Given final block not properly 
> padded
> at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:966)
> at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:824)
> at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:436)
> at javax.crypto.Cipher.doFinal(Cipher.java:2004)
> at 
> org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(SecurityHandler.java:352)
> ... 12 more
> {noformat}
> The parsing code looks something like:
> {noformat}
> FileDataSource fd = new FileDataSource("i-129.pdf");
> PreflightParser parser = new PreflightParser(fd);
> parser.parse();
> {noformat}






[jira] [Commented] (PDFBOX-3272) Loaded fonts file descriptors open after closing document

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202161#comment-15202161
 ] 

John Hewson commented on PDFBOX-3272:
-

Makes sense, close() does certainly need to actually close something.

> Loaded fonts file descriptors open after closing document
> -
>
> Key: PDFBOX-3272
> URL: https://issues.apache.org/jira/browse/PDFBOX-3272
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 1.8.11, 2.0.0
> Environment: Apache Tomcat, Linux
>Reporter: Gregor Ambrozic
>Assignee: Andreas Lehmkühler
> Fix For: 1.8.12, 2.0.1, 2.1.0
>
> Attachments: OpenSans-Regular.ttf
>
>
> I am experiencing problems with TTF fonts loaded for generating PDFs which 
> eventually result in too many open files on Linux. The PDFBox version I 
> tested last was 2.0.0-RC3.
> Basically for each PDF I create a document and load two fonts which I want to 
> use. After the document is generated I close all the resources, but the file 
> descriptors for both fonts remain open.
> The file descriptors should be automatically closed or an API should exist to 
> close font resources.
> My basic code:
> {code}
> import java.io.File;
> import java.io.IOException;
> import java.lang.management.ManagementFactory;
> import org.apache.commons.io.output.ByteArrayOutputStream;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDPageContentStream;
> import org.apache.pdfbox.pdmodel.common.PDRectangle;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType0Font;
> public class FontTest
> {
>     // Creates one PDF document per second for 100 seconds, closing all
>     // resources after each one. The count of open font file descriptors
>     // keeps increasing until the program finishes.
>     // Command to check open files: lsof -p PID | grep ttf
>     public static void main(String[] args)
>     {
>         // should print out PID before @
>         System.out.println("process id: " + ManagementFactory.getRuntimeMXBean().getName());
>         for (int i = 0; i < 100; i++)
>         {
>             createPDF();
>             try
>             {
>                 Thread.sleep(1000);
>             }
>             catch (InterruptedException e)
>             {
>                 e.printStackTrace();
>             }
>         }
>     }
>
>     private static void createPDF()
>     {
>         PDDocument doc = null;
>         ByteArrayOutputStream bos = null;
>         try
>         {
>             doc = new PDDocument();
>             PDPage page = new PDPage(PDRectangle.A4);
>             doc.addPage(page);
>             // load the TrueType font from disk
>             PDFont font = PDType0Font.load(doc, new File("./pdf/OpenSans-Regular.ttf"));
>             PDPageContentStream content = new PDPageContentStream(doc, page);
>             content.beginText();
>             content.setFont(font, 72);
>             content.showText("OMG");
>             content.endText();
>             content.close();
>             bos = new ByteArrayOutputStream();
>             doc.save(bos);
>             byte[] bytes = bos.toByteArray();
>             System.out.println("create new pdf with size: " + bytes.length);
>         }
>         catch (Exception e)
>         {
>             e.printStackTrace();
>         }
>         finally
>         {
>             try
>             {
>                 // guard against a failure before either object was created
>                 if (doc != null)
>                 {
>                     doc.close();
>                 }
>                 if (bos != null)
>                 {
>                     bos.close();
>                 }
>             }
>             catch (IOException e)
>             {
>                 e.printStackTrace();
>             }
>         }
>     }
> }
> {code}






Re: Open TrueType Fonts

2016-03-19 Thread George Sexton



On 3/18/2016 3:09 PM, John Hewson wrote:
> When we do the scan, we close each font immediately afterwards. If 
> PDFBox then goes on to use that font, we’ll re-open the file and then 
> cache the new font. The cache is a map of SoftReference objects, 
> which get released when the JVM feels like it. But now I’m wondering, 
> when do we close fonts which are dropped from the FontCache? Maybe 
> there’s an issue here… — John



I just did lsof -p  and I have 148 TrueType fonts open.


--
George Sexton
*MH Software, Inc.*
Voice: 303 438 9585
http://www.connectdaily.com


[jira] [Created] (PDFBOX-3277) Unable to write Chinese characters into PDF

2016-03-19 Thread Tiruppathi Rajan Gunaseelan (JIRA)
Tiruppathi Rajan Gunaseelan created PDFBOX-3277:
---

 Summary: Unable to write Chinese characters into PDF
 Key: PDFBOX-3277
 URL: https://issues.apache.org/jira/browse/PDFBOX-3277
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel, Writing
Affects Versions: 2.0.0
 Environment: Windows, JDK7
Reporter: Tiruppathi Rajan Gunaseelan



Please refer to the JIRA 

https://issues.apache.org/jira/browse/PDFBOX-1705

I am unable to write Chinese content into a PDF using EmbeddedFonts.java from 
PDFBOX-1705. I later found a way to fix it by feeding in "arialuni.ttf" instead 
of the LiberationSans-Regular.ttf used in the example. I confirmed that Chinese 
PDF generation works by writing hardcoded Chinese text in my example. 

I then extended this logic to write the desired PDF in my application, but ran 
into "java.lang.IllegalArgumentException: U+008A is not available in this 
font's encoding: WinAnsiEncoding". Please advise.

1. From my application, I call an external web service that returns the 
Chinese + English content in its response as a String.
2. I read this response string using UTF-8 as the charset and unmarshal it to a 
Java bean using JAXRB.
3. I pass the string to contentStream.drawString() by calling the getter 
method on the POJO, which has the Chinese data stored in a String variable. This 
is where I get the above exception. Please help me to fix this.
I am using PDFBox version 2.0.0-RC3. Your help is much appreciated.

Also, the generated PDF file is several MB in size. I believe this is because 
arialuni.ttf, which is 23 MB, is fed in. Is there a way to reduce the size of 
the output PDF file? Is there an alternative to arialuni.ttf? Please advise.
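
The report does not establish where U+008A comes from. One common way such a code point appears is when UTF-8 bytes are decoded with a single-byte charset somewhere along the way. The following sketch, with hypothetical class and method names and no PDFBox calls, shows how the character 上 (U+4E0A) turns into three characters ending in U+008A when its UTF-8 bytes are decoded as ISO-8859-1. Whether that is what happened in this application is an assumption, not something the report confirms.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not part of PDFBox.
class CharsetMismatchDemo {

    // Decodes UTF-8 bytes with the wrong charset, as a misconfigured client might.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Formats each char of the string as U+XXXX for inspection.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String chinese = "\u4E0A"; // the character 上
        System.out.println(codePoints(chinese));            // U+4E0A
        System.out.println(codePoints(misdecode(chinese))); // U+00E4 U+00B8 U+008A
    }
}
```

Printing the code points of the string just before it is handed to the content stream would show whether it still contains the expected CJK characters.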






[jira] [Updated] (PDFBOX-3273) Fonts not rendered correctly

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-3273:
---
Fix Version/s: 2.1.0
   2.0.1

> Fonts not rendered correctly
> 
>
> Key: PDFBOX-3273
> URL: https://issues.apache.org/jira/browse/PDFBOX-3273
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Oracle Java8 - OS X Yosemite 10.10.2
>Reporter: Jacopo Pugliese
>  Labels: type1font
> Fix For: 2.0.1, 2.1.0
>
> Attachments: PDFBOX-3273.pdf
>
>
> Fonts are not correctly rendered when using PDFBOX 2.0.0RC3  to extract 
> images from a pdf.
> Here is the pdf I am using for testing and two examples of images (the 
> wrongly rendered characters are the red ones in the yellow box):
> https://drive.google.com/open?id=0B9ji30i4c2KmcndCZGYxbk5telE






[jira] [Updated] (PDFBOX-2048) TextExtraction only working after uncompressing with pdftk

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2048:
---
Fix Version/s: 1.8.6
   2.0.0

> TextExtraction only working after uncompressing with pdftk
> --
>
> Key: PDFBOX-2048
> URL: https://issues.apache.org/jira/browse/PDFBOX-2048
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Rendering, Text extraction
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 1.8.6, 2.0.0
>
>
> From Jonas Karlsson on the user list:
> ===
> We have a user with PDFs generated by a commercial transcription service.
> When we try to extract text from these pdfs, pdfbox returns a few empty
> lines. We get this result both from our own code, and when using the
> ExtractText command line tool
> If I specify the non-sequential parser, with the -nonSeq flag, the
> following error is produced:
> Apr 28, 2014 10:35:11 AM org.apache.pdfbox.pdfparser.NonSequentialPDFParser
> validateStreamLength
> SEVERE: The end of the stream doesn't point to the correct offset, using
> workaround to read the stream
> If I uncompress the file with pdftk, pdfbox is able to successfully extract
> the text.
> ===
> I have been given permission to attach the file "committers only". So don't 
> pass it around, avoid quoting details from the file. The file is also not 
> rendering. The lengths of the streams are 0.






[jira] [Commented] (PDFBOX-3273) Fonts not rendered correctly

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198472#comment-15198472
 ] 

John Hewson commented on PDFBOX-3273:
-

Thinking about it, another approach could be to introduce a constant scaling 
factor so that we're basically using an integer fixed point representation. 
Then once the calculations are complete, we scale the path by the factor.
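
A minimal sketch of that idea, assuming a scaling factor of 1000 and a hypothetical helper class (the comment does not say which calculations are affected or what factor would suit them). Relative moves are accumulated as scaled integers, and the result is converted back only when the absolute positions are produced.

```java
// Hypothetical helper for illustration; not FontBox code.
class FixedPointPath {

    // Assumed scaling factor: three decimal digits of precision.
    static final long SCALE = 1000;

    // Converts a decimal coordinate to its scaled integer form.
    static long toFixed(double value) {
        return Math.round(value * SCALE);
    }

    // Accumulates relative moves in integer arithmetic and scales the
    // resulting absolute positions back by the factor.
    static double[] accumulate(double[] deltas) {
        double[] positions = new double[deltas.length];
        long current = 0;
        for (int i = 0; i < deltas.length; i++) {
            current += toFixed(deltas[i]); // exact integer addition
            positions[i] = (double) current / SCALE;
        }
        return positions;
    }
}
```

With ten moves of 0.1 each, the integer sum is exactly 1000, so the final position is exactly 1.0 rather than a value affected by accumulated floating-point error.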




[jira] [Comment Edited] (PDFBOX-3267) Using threads results in different images

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198498#comment-15198498
 ] 

John Hewson edited comment on PDFBOX-3267 at 3/17/16 12:48 AM:
---

Looking at the different images you posted, my first suspect would be glyph 
caching inside TTF fonts, because the fonts are shared between threads.
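
If a shared glyph cache does turn out to be the cause, one way to make such a cache safe is sketched below. `GlyphCache` and its loader are hypothetical names, not the FontBox classes, and the sketch assumes the loader itself is safe to call from several threads. If the loader reads from a shared stream, that read would need its own synchronization.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical glyph cache for illustration; not the FontBox implementation.
class GlyphCache<G> {

    private final Map<Integer, G> glyphs = new ConcurrentHashMap<>();
    private final IntFunction<G> loader;

    GlyphCache(IntFunction<G> loader) {
        this.loader = loader;
    }

    // Loads each glyph at most once, even when many threads ask for it
    // at the same time; later callers receive the cached instance.
    G get(int glyphId) {
        return glyphs.computeIfAbsent(glyphId, id -> loader.apply(id));
    }
}
```

`ConcurrentHashMap.computeIfAbsent` runs the mapping function atomically per key, so two threads can never store different glyphs under the same id.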


was (Author: jahewson):
Looking at the different images you posted, my first suspect would be glyph 
caching inside TTF fonts.

> Using threads results in different images
> -
>
> Key: PDFBOX-3267
> URL: https://issues.apache.org/jira/browse/PDFBOX-3267
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Attachments: ComparePDF.java, bmpwithdpi.pdf, bmpwithdpi2.pdf, 
> t1.png, t2.png
>
>
> If i dont use threads images are the same
> java -cp pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:. ComparePDF
> Exception in thread "main" java.io.IOException: Not equals
>   at ComparePDF.main(ComparePDF.java:21)






[jira] [Commented] (PDFBOX-3272) Loaded fonts file descriptors open after closing document

2016-03-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198846#comment-15198846
 ] 

Andreas Lehmkühler commented on PDFBOX-3272:


I already added that missing close. The remaining issue seems to be a missing 
close in RAFDataStream#close: the provided file isn't closed when the stream is 
closed. But I have to run some further tests to be sure.




Re: GSoC 2016

2016-03-19 Thread John Hewson

> On 15 Mar 2016, at 13:18, Tilman Hausherr  wrote:
> 
> I don't have a good project idea for GSoC 2016, which is sad because I like 
> the concept.
> 
> Here are some not-so-good project ideas:
> 
> - JPX converter - we really need that, but I don't have enough skills to be 
> able to mentor that. I had an idea who could mentor that and he would be 
> perfect, but he doesn't have the time :-(
> - Transparency groups - several skilled people have tried and have not fully 
> succeeded in this, so it could well be that a student tries and then fails 
> and everybody is unhappy.
> 
> Thus this year, I'll just relax and maybe do some stuff on my own.

We could do with a new text extractor, written from the ground up.

— John

> Tilman



[jira] [Closed] (PDFBOX-3277) Unable to write Chinese characters into PDF

2016-03-19 Thread Tiruppathi Rajan Gunaseelan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tiruppathi Rajan Gunaseelan closed PDFBOX-3277.
---
Resolution: Fixed

I just tried building the latest PDFBox API source code and found that the 
Chinese characters are properly written to the PDF; the output file is also 
only KBs in size now. I apologize for the inconvenience and am closing the defect.




[jira] [Commented] (PDFBOX-3275) Show glyph bounds in DrawPrintTextLocations

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202184#comment-15202184
 ] 

John Hewson commented on PDFBOX-3275:
-

Could you post 032431.pdf? I'm going to take a look at the PDVectorFont issue 
you mentioned in the comment.

> Show glyph bounds in DrawPrintTextLocations
> ---
>
> Key: PDFBOX-3275
> URL: https://issues.apache.org/jira/browse/PDFBOX-3275
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
>
> There have been repeated discussions about getting actual glyph bounds, but 
> no code has been written.






[jira] [Commented] (PDFBOX-3272) Loaded fonts file descriptors open after closing document

2016-03-19 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198494#comment-15198494
 ] 

John Hewson commented on PDFBOX-3272:
-

The stream can't be closed after parsing because TTF parsing is actually 
lazy: the TrueTypeFont holds on to the stream and can subsequently read further 
data from it. For that reason there's a TrueTypeFont#close() method, which 
allows any streams held by the font to be closed once we're done with it. I 
suspect that we're missing a call to that method somewhere.
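
The lazy-parsing pattern described above can be sketched with plain JDK classes. LazyFont below is a hypothetical stand-in, not the FontBox TrueTypeFont class: it keeps its input stream open so that data can be read on demand, and it releases the stream only when close() is called. If the owner forgets that call, the underlying file descriptor stays open, which is the suspected cause here.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a lazily parsed font; not the FontBox API.
class LazyFont implements Closeable {
    private final InputStream data; // kept open after construction for deferred reads
    private boolean closed = false;

    LazyFont(InputStream data) {
        this.data = data;
    }

    /** Reads the next byte on demand; this is why the stream cannot be closed right after "parsing". */
    int readNextByte() throws IOException {
        if (closed) {
            throw new IOException("font already closed");
        }
        return data.read();
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() throws IOException {
        closed = true;
        data.close(); // releases the underlying stream (a file descriptor for file-backed streams)
    }
}

class LazyFontDemo {
    public static void main(String[] args) throws IOException {
        LazyFont seen;
        // try-with-resources guarantees close() once the font is no longer needed
        try (LazyFont font = new LazyFont(new ByteArrayInputStream(new byte[] {1, 2, 3}))) {
            System.out.println("deferred read: " + font.readNextByte());
            seen = font;
        }
        System.out.println("closed after use: " + seen.isClosed());
    }
}
```

In this sketch the caller owns the font and closes it. In the issue the document loads the font, so closing the document would presumably be the natural place for the missing call.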


> Loaded fonts file descriptors open after closing document
> -
>
> Key: PDFBOX-3272
> URL: https://issues.apache.org/jira/browse/PDFBOX-3272
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 1.8.11, 2.0.0
> Environment: Apache Tomcat, Linux
>Reporter: Gregor Ambrozic
>Assignee: Andreas Lehmkühler
> Fix For: 1.8.12, 2.0.1, 2.1.0
>
> Attachments: OpenSans-Regular.ttf
>
>
> I am experiencing problems with TTF fonts loaded for generating PDFs which 
> eventually result in too many open files on Linux. The PDFBox version I 
> tested last was 2.0.0-RC3.
> Basically for each PDF I create a document and load two fonts which I want to 
> use. After the document is generated I close all the resources, but the file 
> descriptors for both fonts remain open.
> The file descriptors should be automatically closed or an API should exist to 
> close font resources.
> My basic code:
> {code}
> import java.io.File;
> import java.io.IOException;
> import java.lang.management.ManagementFactory;
> import org.apache.commons.io.output.ByteArrayOutputStream;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDPageContentStream;
> import org.apache.pdfbox.pdmodel.common.PDRectangle;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType0Font;
> public class FontTest
> {
>     // Creates one PDF document per second for 100 seconds and closes all
>     // resources after each one. The count of open font file descriptors
>     // keeps increasing until the program finishes.
>     // Command to check open files: lsof -p PID | grep ttf
>     public static void main(String[] args)
>     {
>         // should print out PID before @
>         System.out.println("process id: " + ManagementFactory.getRuntimeMXBean().getName());
>         for (int i = 0; i < 100; i++)
>         {
>             createPDF();
>             try
>             {
>                 Thread.sleep(1000);
>             }
>             catch (InterruptedException e)
>             {
>                 e.printStackTrace();
>             }
>         }
>     }
>     private static void createPDF()
>     {
>         PDDocument doc = null;
>         PDPage page = null;
>         ByteArrayOutputStream bos = null;
>         try
>         {
>             doc = new PDDocument();
>             page = new PDPage(PDRectangle.A4);
>             doc.addPage(page);
>             // load a TrueType font from disk; this file's descriptor stays open
>             PDFont font = PDType0Font.load(doc, new File("./pdf/OpenSans-Regular.ttf"));
>             PDPageContentStream content = new PDPageContentStream(doc, page);
>             content.beginText();
>             content.setFont(font, 72);
>             content.showText("OMG");
>             content.endText();
>             content.close();
>             bos = new ByteArrayOutputStream();
>             doc.save(bos);
>             byte[] bytes = bos.toByteArray();
>             System.out.println("create new pdf with size: " + bytes.length);
>         }
>         catch (Exception e)
>         {
>             e.printStackTrace();
>         }
>         finally
>         {
>             try
>             {
>                 // guard against a failure before either object was created
>                 if (doc != null)
>                 {
>                     doc.close();
>                 }
>                 if (bos != null)
>                 {
>                     bos.close();
>                 }
>             }
>             catch (IOException e)
>             {
>                 e.printStackTrace();
>             }
>         }
>     }
> }
> {code}






[jira] [Commented] (PDFBOX-3276) Double encryption dictionary for files with XRef stream

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201945#comment-15201945
 ] 

ASF subversion and git services commented on PDFBOX-3276:
-

Commit 1735656 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1735656 ]

PDFBOX-3276: refactor double code

> Double encryption dictionary for files with XRef stream
> ---
>
> Key: PDFBOX-3276
> URL: https://issues.apache.org/jira/browse/PDFBOX-3276
> Project: PDFBox
>  Issue Type: Bug
>  Components: Crypto, Writing
>Affects Versions: 2.0.0, 2.0.1, 2.1.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 2.0.1, 2.1.0
>
> Attachments: annots-encrypted.pdf, annots.pdf
>
>
> This was first mentioned by Patrick S. in the mailing list:
> {quote}
> This is not a general problem and only occurs with original PDFs generated 
> with 3D content using Anark. The file, when loaded, seems to be encrypted and 
> loads just fine in Adobe Reader, but when we try to do a "Save As" we get the 
> following error:
> "The document could not be saved. There was a problem reading this document 
> 21."
> If I Ctrl-click the "OK" button, I get the following message:
> "This direct object already has a container."
> {quote}
> I can reproduce the effect with the attached file by using the Encrypt 
> command line tool. A look at the file shows a double dictionary:
> {code}
> 593 0 obj
> <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> endobj
> 594 0 obj
> <<
> /ID [<1D7A1969B33886DCF0DD4B0176F149AF> ]
> /Info 7 0 R
> /Root 1 0 R
> /Encrypt <<
> /Filter /Standard
> /V 1
> /R 3
> /Length 40
> /P -4
> /O <10780080A0085854C58A57FCAFBD94A3CA3F7DF6FFE9DBC4834B7AAF144602C9>
> /U <7CF00AD61911DB6A737867655ED3520C28BF4E5E4E758A4164004E56FFFA0108>
> >>
> {code}
> I don't know if this is the cause, but it doesn't belong there.
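
To check other output files for the same symptom, the dictionary text can be scanned for an inline /Encrypt dictionary. This is a minimal sketch with JDK regular expressions; EncryptEntryCheck is a hypothetical helper, not part of PDFBox. It assumes the dictionary is available as plain text, and it assumes that the intended form is an indirect reference such as "/Encrypt 593 0 R", which the report implies but does not state.

```java
import java.util.regex.Pattern;

// Hypothetical checker working on the textual form of a trailer or XRef stream dictionary.
class EncryptEntryCheck {
    // "/Encrypt" followed directly by "<<" means the dictionary is written inline
    private static final Pattern DIRECT = Pattern.compile("/Encrypt\\s*<<");
    // "/Encrypt 593 0 R" style: object number, generation number, then R
    private static final Pattern INDIRECT = Pattern.compile("/Encrypt\\s+\\d+\\s+\\d+\\s+R\\b");

    static boolean hasDirectEncryptDictionary(String dictText) {
        return DIRECT.matcher(dictText).find();
    }

    static boolean hasIndirectEncryptReference(String dictText) {
        return INDIRECT.matcher(dictText).find();
    }

    public static void main(String[] args) {
        String inline = "/Info 7 0 R\n/Root 1 0 R\n/Encrypt <<\n/Filter /Standard\n/V 1\n>>";
        String referenced = "/Info 7 0 R\n/Root 1 0 R\n/Encrypt 593 0 R";

        System.out.println("inline dictionary detected: " + hasDirectEncryptDictionary(inline));
        System.out.println("reference detected: " + hasIndirectEncryptReference(referenced));
    }
}
```

A text scan like this is only a quick diagnostic. It will not see dictionaries stored inside compressed object streams.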






[jira] [Commented] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197860#comment-15197860
 ] 

Tilman Hausherr commented on PDFBOX-3274:
-

This is only relevant for text extraction (it is another flaw in your PDF), but 
that code segment is also called by rendering. If you want, you can set pdfbox, 
or just org.apache.pdfbox.pdmodel.font.PDType0Font so that it only reports 
errors. Here's a line for log4j:
{code}
log4j.logger.org.apache.pdfbox.pdmodel.font.PDType0Font=ERROR
{code}
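
If log4j is not on the classpath, the same effect can probably be had through java.util.logging; the log lines quoted in this issue have that format. The sketch below uses only the JDK. The class name QuietPdfBoxLogging is hypothetical, and it is an assumption that PDFBox's logging ends up in java.util.logging in the reporter's setup.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; only the logger name is taken from the log4j line above.
class QuietPdfBoxLogging {
    // Keep a strong reference: java.util.logging may garbage-collect unreferenced
    // loggers, which would silently drop the configured level.
    static final Logger FONT_LOGGER =
            Logger.getLogger("org.apache.pdfbox.pdmodel.font.PDType0Font");

    static void apply() {
        // SEVERE is the closest java.util.logging level to log4j's ERROR
        FONT_LOGGER.setLevel(Level.SEVERE);
    }

    public static void main(String[] args) {
        apply();
        System.out.println("INFO enabled: " + FONT_LOGGER.isLoggable(Level.INFO));
        System.out.println("SEVERE enabled: " + FONT_LOGGER.isLoggable(Level.SEVERE));
    }
}
```

Calling apply() once at application start, before any PDF is processed, should be enough. Using the logger name "org.apache.pdfbox" instead would cover the whole library.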


> Unreadable PDF print
> 
>
> Key: PDFBOX-3274
> URL: https://issues.apache.org/jira/browse/PDFBOX-3274
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.11
> Environment: Windows (7 and 10), Eclipse Mars
>Reporter: Arthur Vuijk
> Attachments: 1062542195549019.pdf
>
>
> Newbie to PDFBox!
> {code}
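> import java.awt.print.PrinterException;
> import java.awt.print.PrinterJob;
> import java.io.IOException;
> import javax.print.PrintService;
> import org.apache.pdfbox.pdmodel.PDDocument;
>
> // The class name is a placeholder; the original post showed only main().
> public class PrintTest {
>   public static void main(String[] args) {
>     String printername = "HP Photosmart 6510 series";
>     PDDocument document = null;
>     try {
>       PrinterJob printJob = PrinterJob.getPrinterJob();
>       document = PDDocument.load("E:\\1062542195549019.pdf"); // PDFBox 1.8 API
>       // Select the first print service whose name contains printername.
>       PrintService[] printServices = PrinterJob.lookupPrintServices();
>       boolean printerFound = false;
>       for (int i = 0; !printerFound && i < printServices.length; i++) {
>         if (printServices[i].getName().indexOf(printername) != -1) {
>           printJob.setPrintService(printServices[i]);
>           printerFound = true;
>         }
>       }
>       document.silentPrint(printJob);
>     } catch (IOException e) {
>       System.out.println("Error while printing");
>       e.printStackTrace();
>     } catch (PrinterException e) {
>       System.out.println("Error while printing");
>       e.printStackTrace();
>     } finally {
>       // Close the document even if loading or printing failed.
>       if (document != null) {
>         try {
>           document.close();
>         } catch (IOException e) {
>           e.printStackTrace();
>         }
>       }
>     }
>   }
> }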
> {code}
> Returns:
> {code}
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: EMC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: EMC
> mrt 15, 2016 10:01:25 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:26 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: EMC
> mrt 15, 2016 10:01:26 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: BDC
> mrt 15, 2016 10:01:26 PM org.apache.pdfbox.util.PDFStreamEngine 
> processOperator
> INFO: unsupported/disabled operation: EMC
> {code}
> PDF File contains CID font types!






[jira] [Commented] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread Arthur Vuijk (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198018#comment-15198018
 ] 

Arthur Vuijk commented on PDFBOX-3274:
--

I do not agree that the issue is Not A Problem. 2.0 is still an RC, and the 
statement that PDFBox can handle 'any' PDF that Adobe Reader accepts is not 
true in this case. For us, implementing 2.0 requires an extensive amount of 
recoding, simply because 1.8 can't handle a font? Despite all your good 
intentions and voluntary work, I'm quite disappointed and would really like a 
fix in 1.8 to avoid several hours of recoding and testing.

I understand closing the issue because there is a workaround (moving to 2.0), 
but the bug remains...




[jira] [Updated] (PDFBOX-2161) A PDRadioButton with no children throws an NPE

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2161:
---
Fix Version/s: 1.8.7
   2.0.0

> A PDRadioButton with no children throws an NPE
> --
>
> Key: PDFBOX-2161
> URL: https://issues.apache.org/jira/browse/PDFBOX-2161
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.6
>Reporter: Tim Allison
>Priority: Minor
> Fix For: 1.8.7, 2.0.0
>
> Attachments: PDFBox-2161.patch, PDFBox-2161.patch
>
>
> In some PDFs, a PDRadioButton may have no children.  This leads to an 
> NPE when a client calls getValue().  The javadocs say that getValue() can 
> throw an IOException if there is a problem getting the value, but not an NPE.
> A doc that shows this issue is:
> [562254|http://digitalcorpora.org/corp/nps/files/govdocs1/562/562254.pdf]






Re: [VOTE] Release Apache PDFBox 2.0.0

2016-03-19 Thread John Hewson

> On 16 Mar 2016, at 11:56, Andreas Lehmkuehler wrote:
> 
> Hi,
> 
> On 14.03.2016 at 18:35, Andreas Lehmkuehler wrote:
>> Please vote on releasing this package as Apache PDFBox 2.0.0.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 PDFBox PMC votes are cast.
>> 
>> [ ] +1 Release this package as Apache PDFBox 2.0.0
>> [ ] -1 Do not release this package because...
> 
> Just a friendly reminder, there are round about 23 hours left to check the 
> release and to cast your vote.
> 

+1 works for me! I’m confident that we can fix that outstanding issue later, I 
think it’s simply a missing .close() call.

> TIA
> Andreas
> 



Re: Open TrueType Fonts

2016-03-19 Thread Tilman Hausherr

 https://issues.apache.org/jira/browse/PDFBOX-3272

Tilman




On 17.03.2016 at 21:29, George Sexton wrote:
I was looking through things on my server and I'm seeing that my 
application has about 130 TrueType font files open. I'm running this 
under Tomcat so the process is persistent.


I've dug through the source code a little bit and I can see that 
AbstractTTFParser.parseTTF(File ttfFile) explicitly doesn't close the 
stream. I'm guessing this is because there's a way to do deferred 
parsing of the file.


I looked at o.a.fontbox.util.FontManager as well, and it looks like 
when it gets created, it's loading all available truetype fonts and at 
least getting the Font Names table to map the font names to the 
constructed TrueTypeFont objects.


Is there ever a point when the file input stream could be closed? Is 
there some operation performed on TrueTypeFont that would definitively 
conclude access to the stream?










Re: Release preparations

2016-03-19 Thread Andreas Lehmkuehler

Hi,

On 14.03.2016 at 19:41, Andreas Lehmkuehler wrote:

Hi,

I've done some configuration due to the upcoming final release:

- new 2.0 branch as a copy of the current trunk (rev. r1734960)
- the trunk now uses 2.1.0-SNAPSHOT
- the branch uses 2.0.1-SNAPSHOT
- I've created a new version 2.0.1 in JIRA
- I've created a new jenkins build for the 2.0 branch


I've closed all 2.0.0 related JIRA tickets without email notification and 
updated/closed some resolved tickets without "Fix version" as well. Version 
"2.0.0" was set to released in JIRA.


The download page was updated as well as the doap file.

BR
Andreas




[jira] [Updated] (PDFBOX-2442) false negative? 3.1.6 : Invalid Font definition, Width (633.0) of the character "60" in the font program "BNGLNN+LucidaMath-Symbol" is inconsistent with the width (0.0)

2016-03-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2442:
---
Fix Version/s: 2.0.0

> false negative? 3.1.6 : Invalid Font definition, Width (633.0) of the 
> character "60" in the font program "BNGLNN+LucidaMath-Symbol" is inconsistent 
> with the width (0.0) in the PDF dictionary.
> ---
>
> Key: PDFBOX-2442
> URL: https://issues.apache.org/jira/browse/PDFBOX-2442
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
> Environment: java7 deb7
>Reporter: Ralf Hauser
>Assignee: John Hewson
> Fix For: 2.0.0
>
> Attachments: adobe7pie.pdf
>
>
> org.apache.pdfbox.preflight.font.util.GlyphException: Width (633.0) of the 
> character "60" in the font program "BNGLNN+LucidaMath-Symbol" is inconsistent 
> with the width (0.0) in the PDF dictionary.
>     at org.apache.pdfbox.preflight.font.container.FontContainer.checkWidthsConsistency(FontContainer.java:181)
>     at org.apache.pdfbox.preflight.font.container.FontContainer.checkGlyphWidth(FontContainer.java:130)
>     at org.apache.pdfbox.preflight.content.PreflightContentStream.validText(PreflightContentStream.java:342)
>     at org.apache.pdfbox.preflight.content.PreflightContentStream.validStringArray(PreflightContentStream.java:276)
>     at org.apache.pdfbox.preflight.content.PreflightContentStream.validStringArray(PreflightContentStream.java:272)
>     at org.apache.pdfbox.preflight.content.PreflightContentStream.checkShowTextOperators(PreflightContentStream.java:190)
>     at org.apache.pdfbox.preflight.content.PreflightContentStream.processOperator(PreflightContentStream.java:155)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processSubStream(PDFStreamEngine.java:226)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processSubStream(PDFStreamEngine.java:196)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:152)
>     at org.apache.pdfbox.preflight.content.PreflightContentStream.validPageContentStream(PreflightContentStream.java:76)
>     at org.apache.pdfbox.preflight.process.reflect.SinglePageValidationProcess.validateContent(SinglePageValidationProcess.java:184)
>     at org.apache.pdfbox.preflight.process.reflect.SinglePageValidationProcess.validate(SinglePageValidationProcess.java:87)
>     at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:73)
>     at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:52)
>     at org.apache.pdfbox.preflight.process.PageTreeValidationProcess.validatePage(PageTreeValidationProcess.java:56)
>     at org.apache.pdfbox.preflight.process.PageTreeValidationProcess.validate(PageTreeValidationProcess.java:45)
>     at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:73)
>     at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:88)
>     at org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:168)






[jira] [Commented] (PDFBOX-3274) Unreadable PDF print

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198039#comment-15198039
 ] 

Tilman Hausherr commented on PDFBOX-3274:
-

It's not that easy. Getting PDFBox to handle many advanced font types properly 
required a code redesign and many changes that took over a year; these can't be 
applied to 1.8. I don't intend to offend you with "not a problem"; I could 
of course also have used "won't fix", but the result is the same. Btw, the RC is 
going to be released soon.




[jira] [Commented] (PDFBOX-3276) Double encryption dictionary for files with XRef stream

2016-03-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201935#comment-15201935
 ] 

Tilman Hausherr commented on PDFBOX-3276:
-

But make sure that the fix also works, and if not, please reopen the issue :-) 
It should be available in maven, or here within a few hours
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.1-SNAPSHOT/




Open TrueType Fonts

2016-03-19 Thread George Sexton
I was looking through things on my server and I'm seeing that my 
application has about 130 TrueType font files open. I'm running this 
under Tomcat so the process is persistent.


I've dug through the source code a little bit and I can see that 
AbstractTTFParser.parseTTF(File ttfFile) explicitly doesn't close the 
stream. I'm guessing this is because there's a way to do deferred 
parsing of the file.


I looked at o.a.fontbox.util.FontManager as well, and it looks like when 
it gets created, it's loading all available truetype fonts and at least 
getting the Font Names table to map the font names to the constructed 
TrueTypeFont objects.


Is there ever a point when the file input stream could be closed? Is 
there some operation performed on TrueTypeFont that would definitively 
conclude access to the stream?




--
George Sexton
*MH Software, Inc.*
Voice: 303 438 9585
http://www.connectdaily.com