[
https://issues.apache.org/jira/browse/PDFBOX-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075498#comment-18075498
]
Maruan Sahyoun commented on PDFBOX-6162:
----------------------------------------
I gave it several tries without good progress. IMHO it's really difficult to
test the complete processing because e.g. {{GenericRefinementRegion.java}}
implements the Segment but also the Procedure, and the Procedure is also
used elsewhere, e.g. from TextRegion. Coupled with various instance variables,
this makes it difficult to control and test whether {{cx}} and
{{ArithmeticDecoder}} are being reused when they should or shouldn't be. In
general this is possible with the current code, by instantiating the class and
then either setting the parameters or not, but it's not obvious.
What I'm proposing is to decouple the {{Segment}} from the {{Procedure}}, with
the procedure having no ownership of {{cx}} and the decoder. This way the
procedure can be used cleanly from all relevant segments. It improves
modularity, reusability, and clarity, and aligns with the spec’s separation of
the refinement procedure from segment logic. It also makes state management
explicit and avoids hidden dependencies, reducing the risk of bugs.
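To illustrate the idea, here is a minimal sketch of the proposed split. It is not the actual PDFBox code: every class and method name below ({{SketchContext}}, {{SketchDecoder}}, {{RefinementProcedureSketch}}, {{RefinementSegmentSketch}}) is a hypothetical stand-in, and the "decoder" only produces a dummy bit pattern. The point is just that the procedure receives {{cx}} and the decoder from its caller, so the caller decides explicitly whether state is reused.

```java
/** Hypothetical stand-in for the context statistics ("cx"); not the PDFBox class. */
final class SketchContext {
    private final int[] states;

    SketchContext(int size) { states = new int[size]; }

    int size() { return states.length; }

    /** Records one adaptation step for the given context index. */
    void bump(int index) { states[index]++; }

    int state(int index) { return states[index]; }
}

/** Hypothetical stand-in for the arithmetic decoder; emits a dummy bit pattern. */
final class SketchDecoder {
    private int decoded;

    int decodeBit(SketchContext cx, int index) {
        cx.bump(index);        // decoding adapts the context, which is why reuse matters
        return decoded++ & 1;  // alternating 0/1, NOT real arithmetic decoding
    }

    int decodedCount() { return decoded; }
}

/** The procedure: owns no state, everything is passed in by the caller. */
final class RefinementProcedureSketch {
    private RefinementProcedureSketch() { }

    static int[] decode(int width, SketchContext cx, SketchDecoder decoder) {
        int[] row = new int[width];
        for (int x = 0; x < width; x++) {
            row[x] = decoder.decodeBit(cx, x % cx.size());
        }
        return row;
    }
}

/** The segment: owns cx and the decoder and hands them to the procedure. */
final class RefinementSegmentSketch {
    private final SketchContext cx = new SketchContext(4);
    private final SketchDecoder decoder = new SketchDecoder();

    int[] decodeRow(int width) {
        return RefinementProcedureSketch.decode(width, cx, decoder);
    }

    SketchContext context() { return cx; }
}
```

With this shape, a test can call {{RefinementProcedureSketch.decode}} twice with the same {{SketchContext}} to exercise reuse, or pass a fresh one to exercise the non-reuse path, without having to set up a whole segment. A caller such as a text region could pass in its own context and decoder in the same way.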
We can do this without affecting the current public API. I'm looking to move
forward with GenericRefinementRegion first, decoupling the other
Segments/Procedures later.
Thoughts?
> Reuse of symbol context not properly supported
> ----------------------------------------------
>
> Key: PDFBOX-6162
> URL: https://issues.apache.org/jira/browse/PDFBOX-6162
> Project: PDFBox
> Issue Type: Sub-task
> Components: JBIG2
> Affects Versions: 3.0.4 JBIG2
> Reporter: Tilman Hausherr
> Priority: Major
> Attachments: bitmap-symbol-context-reuse.pdf
>
>
> .ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
> at
> org.apache.pdfbox.jbig2.segments.SymbolDictionary.getToExportFlags(SymbolDictionary.java:898)
> at
> org.apache.pdfbox.jbig2.segments.SymbolDictionary.getDictionary(SymbolDictionary.java:467)
> at
> org.apache.pdfbox.jbig2.segments.SymbolDictionary.retrieveImportSymbols(SymbolDictionary.java:990)
> at
> org.apache.pdfbox.jbig2.segments.SymbolDictionary.setInSyms(SymbolDictionary.java:267)
> at
> org.apache.pdfbox.jbig2.segments.SymbolDictionary.parseHeader(SymbolDictionary.java:130)
> at
> org.apache.pdfbox.jbig2.segments.SymbolDictionary.init(SymbolDictionary.java:1025)
> at
> org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:380)
> Considering the name of the file, I assume this means we don't support the
> reuse of symbols correctly. The file has at least 4 different symbol
> segments. From what I see on
> https://github.com/SerenityOS/serenity/blob/master/Tests/LibGfx/test-inputs/jbig2/json/bitmap-symbol-context-reuse.json
> the text segment refers to the symbols of the 4 previous symbol segments, and
> the symbol segments indicate some logic to retain the symbols of previous
> segments, so one will have to investigate what happens.