[ 
https://issues.apache.org/jira/browse/PDFBOX-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075498#comment-18075498
 ] 

Maruan Sahyoun commented on PDFBOX-6162:
----------------------------------------

I gave it several tries without much progress. IMHO it's really difficult to 
test the complete processing because e.g. {{GenericRefinementRegion.java}} 
implements both the Segment and the Procedure, and the Procedure is also used 
elsewhere, e.g. from TextRegion. Coupled with the various instance variables, 
this makes it difficult to control and test whether {{cx}} and 
{{ArithmeticDecoder}} are being reused when they should or shouldn't be. In 
general this is possible with the current code, by instantiating the class and 
then either setting the parameters or not, but it's not obvious.

What I'm proposing is to decouple the {{Segment}} from the {{Procedure}}, with 
the procedure having no ownership of {{cx}} and the decoder. This way it can be 
used cleanly from all relevant segments. It improves modularity, reusability, 
and clarity, and aligns with the spec's separation of the refinement procedure 
from segment logic. It also makes state management explicit and avoids hidden 
dependencies, reducing the risk of bugs.
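
As a rough illustration of the proposed split, here is a minimal sketch. All 
names in it ({{ToyDecoder}}, {{ToyContext}}, {{RefinementProcedure}}, 
{{RefinementSegment}}, {{decodeRegion}}) are hypothetical toy stand-ins, not the 
existing PDFBox JBIG2 classes or their API, and the "decoding" is a placeholder 
that only makes context reuse observable. The point it tries to show is that 
the procedure borrows decoder and context per call, while the segment owns 
them and decides explicitly whether to reuse or reset.

```java
import java.util.Arrays;

class RefinementDecouplingSketch {

    /** Hypothetical stand-in for the cx context statistics. */
    static final class ToyContext {
        int state = 0;
    }

    /** Hypothetical stand-in for the arithmetic decoder. */
    static final class ToyDecoder {
        /** Placeholder "decode": the result depends on, and updates, the context. */
        int decodeBit(ToyContext cx) {
            int bit = cx.state & 1;
            cx.state++;
            return bit;
        }
    }

    /** The procedure holds no state; decoder and cx are passed in by the caller. */
    static final class RefinementProcedure {
        int[] decode(ToyDecoder decoder, ToyContext cx, int count) {
            int[] bits = new int[count];
            for (int i = 0; i < count; i++) {
                bits[i] = decoder.decodeBit(cx);
            }
            return bits;
        }
    }

    /** The segment owns decoder and cx and makes reuse an explicit decision. */
    static final class RefinementSegment {
        private final RefinementProcedure procedure = new RefinementProcedure();
        private final ToyDecoder decoder = new ToyDecoder();
        private ToyContext cx = new ToyContext();

        int[] decodeRegion(int count, boolean reuseContext) {
            if (!reuseContext) {
                cx = new ToyContext(); // start from fresh statistics
            }
            return procedure.decode(decoder, cx, count);
        }
    }

    public static void main(String[] args) {
        RefinementSegment segment = new RefinementSegment();
        System.out.println(Arrays.toString(segment.decodeRegion(3, false)));
        System.out.println(Arrays.toString(segment.decodeRegion(2, true)));
        System.out.println(Arrays.toString(segment.decodeRegion(2, false)));
    }
}
```

In a layout like this, a test could call the procedure directly with a decoder 
and context it controls, which is the part that is hard to do while both sit 
in instance variables of the same class.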

We can do this without affecting the current public API. I'd like to move 
forward with GenericRefinementRegion first and decouple the other 
Segments/Procedures later.

Thoughts?

> Reuse of symbol context not properly supported
> ----------------------------------------------
>
>                 Key: PDFBOX-6162
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-6162
>             Project: PDFBox
>          Issue Type: Sub-task
>          Components: JBIG2
>    Affects Versions: 3.0.4 JBIG2
>            Reporter: Tilman Hausherr
>            Priority: Major
>         Attachments: bitmap-symbol-context-reuse.pdf
>
>
> .ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
>       at org.apache.pdfbox.jbig2.segments.SymbolDictionary.getToExportFlags(SymbolDictionary.java:898)
>       at org.apache.pdfbox.jbig2.segments.SymbolDictionary.getDictionary(SymbolDictionary.java:467)
>       at org.apache.pdfbox.jbig2.segments.SymbolDictionary.retrieveImportSymbols(SymbolDictionary.java:990)
>       at org.apache.pdfbox.jbig2.segments.SymbolDictionary.setInSyms(SymbolDictionary.java:267)
>       at org.apache.pdfbox.jbig2.segments.SymbolDictionary.parseHeader(SymbolDictionary.java:130)
>       at org.apache.pdfbox.jbig2.segments.SymbolDictionary.init(SymbolDictionary.java:1025)
>       at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:380)
>
> Considering the name of the file, I assume this means we don't support the 
> reuse of symbols correctly. The file has at least 4 different symbol 
> segments. From what I see on
> https://github.com/SerenityOS/serenity/blob/master/Tests/LibGfx/test-inputs/jbig2/json/bitmap-symbol-context-reuse.json
> the text segment refers to the symbols of the 4 previous symbol segments, and 
> the symbol segments indicate some logic to retain the symbols of previous 
> segments, so one will have to investigate what happens.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
