[ 
https://issues.apache.org/jira/browse/PDFBOX-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865403#comment-16865403
 ] 

Timo Boehme commented on PDFBOX-4559:
-------------------------------------

I think we have to look at the different levels at which streams are created 
and used with regard to thread safety. The base implementation for our memory 
paging - ScratchFile - is (as the Javadoc states) thread safe (at least it was 
meant to be :) ). However, the RandomAccess instances (ScratchFileBuffer) 
created from it are not, as mixed reads and writes are possible (and so far the 
API did not support parallel access to a single instance). 
RandomAccessInputStream is only a thin layer on top of a RandomAccessRead, in 
this case a ScratchFileBuffer. The first step would be to switch the 
ScratchFileBuffer into a read-only mode (or to have a small wrapper 
implementing RandomAccessRead that only allows thread-safe read access).
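
To make the wrapper idea concrete, here is a minimal sketch. It is not PDFBox 
code: SeekableBuffer and ReadOnlyView are hypothetical stand-ins for 
ScratchFileBuffer and a RandomAccessRead implementation, reduced to what the 
illustration needs. The assumption is that every reader goes through the view, 
which exposes no write method and performs seek plus read under one lock.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a seekable buffer with ONE shared position. */
final class SeekableBuffer {
    private final byte[] data;
    private int position;

    SeekableBuffer(byte[] data) {
        this.data = Arrays.copyOf(data, data.length);
    }

    void seek(int newPosition) {
        position = newPosition;
    }

    /** Reads up to len bytes at the current position; returns -1 at end of data. */
    int read(byte[] dst, int off, int len) {
        if (position >= data.length) {
            return -1;
        }
        int n = Math.min(len, data.length - position);
        System.arraycopy(data, position, dst, off, n);
        position += n;
        return n;
    }

    int length() {
        return data.length;
    }
}

/** Hypothetical read-only view: only positioned reads, each one atomic. */
final class ReadOnlyView {
    private final SeekableBuffer buffer;

    ReadOnlyView(SeekableBuffer buffer) {
        this.buffer = buffer;
    }

    /** Seek and read under one lock so no other thread can move the position in between. */
    int readAt(int position, byte[] dst, int off, int len) {
        if (position < 0) {
            throw new IllegalArgumentException("negative position");
        }
        synchronized (buffer) {
            buffer.seek(position);
            return buffer.read(dst, off, len);
        }
    }

    int length() {
        synchronized (buffer) {
            return buffer.length();
        }
    }
}
```

Because the caller passes the position with every read, the view itself keeps 
no per-reader state. The guarantee only holds as long as nothing touches the 
underlying buffer directly.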

However, even this might not help here, because using a single 
RandomAccessInputStream from multiple threads will go wrong (even if the 
methods were synchronized): no thread would see a sequential stream of input 
bytes, since the other threads would consume some of the bytes in between.
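
The effect is easy to reproduce without any real threads. The sketch below 
uses java.io.ByteArrayInputStream, whose read methods are synchronized, as a 
stand-in for the shared stream. SharedStreamDemo and its method are 
hypothetical names for this illustration. Two readers simply take turns, as a 
scheduler might interleave them.

```java
import java.io.ByteArrayInputStream;

final class SharedStreamDemo {
    /**
     * Simulates two readers taking turns on ONE stream, two bytes per turn,
     * and returns the bytes that reader A ended up with.
     */
    static byte[] bytesSeenByReaderA(byte[] content) {
        ByteArrayInputStream shared = new ByteArrayInputStream(content);
        byte[] seenByA = new byte[4];
        byte[] seenByB = new byte[2];
        shared.read(seenByA, 0, 2); // A's turn: takes the first two bytes
        shared.read(seenByB, 0, 2); // B's turn: consumes the next two bytes
        shared.read(seenByA, 2, 2); // A's turn: continues after B's bytes
        return seenByA;
    }
}
```

For the input {0, 1, 2, 3, 4, 5}, reader A ends up with {0, 1, 4, 5}. Each 
call was atomic, yet A never sees bytes 2 and 3, so a parser fed by A would be 
working on data with a hole in it.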

For thread-safe access, the RandomAccessInputStream has to be created on 
request by the specific thread and method that wants to read the data. Thus the 
COSInputStream would have to store the thread-safe RandomAccessRead 
implementation (as it now does indirectly for the ScratchFileBuffer underlying 
the RandomAccessInputStream) and would need a method that creates a new 
RandomAccessInputStream each time one is needed (it being only a small access 
wrapper for the data).
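
A rough sketch of that shape, again not PDFBox code: StreamSource and 
createInputStream() are hypothetical names, and a plain byte array stands in 
for the thread-safe RandomAccessRead. The holder keeps the shared data and 
hands every caller a fresh stream with its own position.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Hypothetical holder: keeps the shared data, hands out one stream per request. */
final class StreamSource {
    // Never modified after construction, so it is safe to share between threads.
    private final byte[] data;

    StreamSource(byte[] data) {
        this.data = data.clone();
    }

    /**
     * Each call returns an independent stream. ByteArrayInputStream does not
     * copy the array; it only adds a private read position on top of it.
     */
    InputStream createInputStream() {
        return new ByteArrayInputStream(data);
    }
}
```

Two streams obtained this way can be read in any interleaving, and each one 
still delivers the full byte sequence from the start, because the only mutable 
state (the position) is private to the stream.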

 

> Parse error reading document from several threads
> -------------------------------------------------
>
>                 Key: PDFBOX-4559
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4559
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Documentation, Rendering
>    Affects Versions: 2.0.15
>         Environment: Oracle Java 8 update125 on both Mac OS X and centos
>            Reporter: Jack
>            Priority: Major
>              Labels: concurrency, multithreading, type1, type1font
>         Attachments: test.pdf
>
>
> I got the following error while running simple parallel rendering code. 
> However, the error doesn't happen when I change parallelStream() to a 
> sequential stream(). Interestingly, both variants render exactly the same 
> images. I saw a possibly related ticket, PDFBOX-3654, but it seems that issue 
> was fixed. I'd like to learn whether there are more related bugs.  
> *Sample code*:
> {code:java}
> PDDocument document = PDDocument.load(new File(pdfFilename));
> List<PDDocument> pdfPages = new Splitter().split(document);
> pdfPages.parallelStream().forEach(page -> {
>     try {
>         PDFRenderer renderer = new PDFRenderer(page);
>         // change dpi to your number
>         renderer.renderImageWithDPI(0, 180, ImageType.RGB);
>     } catch (IOException e) {
>         System.out.println(e);
>     }
>     try {
>         page.close();
>     } catch (IOException ignored) {
>     }
> });
> try {
>     document.close();
> } catch (IOException ignored) {
> }
> {code}
>  
> *Error log*:
> {noformat}
> ERROR [PDType1Font] Can't read the embedded Type1 font POAEND+Gotham-Book
> java.io.IOException: unexpected closing parenthesis
>  at org.apache.fontbox.type1.Type1Lexer.readToken(Type1Lexer.java:123) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Lexer.nextToken(Type1Lexer.java:75) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.readValue(Type1Parser.java:398) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.readOtherSubrs(Type1Parser.java:707) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.parseBinary(Type1Parser.java:550) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:64) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:85) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:262) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:62)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:869)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:505)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:479)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:152)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:265) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:314) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:243) 
> ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
>  at 
> org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:229)
>  ~[pdfbox-2.0.15-snapshot108.jar:2.0.15-SNAPSHOT]
> WARN [PDType1Font] Using fallback font Helvetica for POAEND+Gotham-Book
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
