[ https://issues.apache.org/jira/browse/PDFBOX-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983120#comment-14983120 ]
John Hewson commented on PDFBOX-3074:
-------------------------------------
We don't want to store housekeeping data in PDMarkedContent. The PD model is
there to expose the underlying PDF objects, but transparency groups and marked
content are unrelated, so that's not something we'd want in the API. Thanks all
the same.
> Mark transparency groups
> ------------------------
>
> Key: PDFBOX-3074
> URL: https://issues.apache.org/jira/browse/PDFBOX-3074
> Project: PDFBox
> Issue Type: New Feature
> Components: Text extraction
> Affects Versions: 2.0.0
> Reporter: Daniel Persson
> Priority: Minor
> Labels: github-import
> Fix For: 2.0.0
>
> Attachments: mark_transparency_groups.patch
>
>
> We try to read text from PDF files, but some of the files include extra data
> that is never shown. These segments are usually grouped in transparency
> groups, so a function that flags marked content as belonging to a
> transparency group is quite useful for us.
> If there is a way to do this, please tell me. If there is a better way to
> remove text that isn't presented or drawn when the PDF is viewed, I'm all
> ears.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)