[ 
https://issues.apache.org/jira/browse/PDFBOX-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713746#comment-17713746
 ] 

Moritz Flöter edited comment on PDFBOX-5584 at 4/19/23 6:00 PM:
----------------------------------------------------------------

I do get that point. Depending on how deeply the plugins are allowed to 
integrate into the Debugger, that could complicate future development.

A lightweight approach that adds little maintenance burden could be to allow 
only the following:
 * Pass the PDF to the plugin as an InputStream and, for plugins that modify 
the document, also pass an OutputStream to write the result to
 ** Plugins that do not take an OutputStream as input and only work on the 
InputStream cannot modify the document and are considered analysis plugins 
(e.g. validation through pdfcpu, ghostscript, perhaps also stuff like image 
extraction, rendering to images etc.)
_performAnalysis(InputStream input)_
 ** Plugins that write their result (the modified PDF file) to an OutputStream 
do modify the document and are considered editing plugins (e.g., removing all 
text, removing all images, removing pages, moving printboxes)
_performPdfModification(InputStream input, OutputStream output)_
 ** After the execution of an editing plugin, PDFDebugger opens the document 
received through the OutputStream and loads it into a PDDocument that is 
displayed in the GUI
 * Plugins are responsible for creating their own dialogs if parameter input is 
needed and are responsible for displaying analysis results
 * Plugins register as a menu entry (similar to the screenshot)
 * As long as no editing plugin has been used after opening the file, the input 
stream passed to an analysis plugin should always be that of the original file. 
If an editing plugin has been used, the PDDocument gets serialized and passed 
to the plugin through the InputStream.
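
A minimal Java sketch of how this contract could look, using only the JDK. 
The method names performAnalysis and performPdfModification are the ones 
proposed above; everything else (AnalysisPlugin, EditingPlugin, PluginHost, 
getMenuLabel and the two toy plugins) is a hypothetical name chosen for 
illustration and is not part of PDFBox. As a simplification, the host keeps 
the current document as a byte array instead of re-serializing a PDDocument.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical read-only plugin: it only sees the PDF bytes. */
interface AnalysisPlugin {
    String getMenuLabel(); // text of the menu entry the plugin registers
    void performAnalysis(InputStream input) throws IOException;
}

/** Hypothetical editing plugin: writes the modified PDF to the output. */
interface EditingPlugin {
    String getMenuLabel();
    void performPdfModification(InputStream input, OutputStream output)
            throws IOException;
}

/** Hypothetical host-side glue standing in for PDFDebugger. */
final class PluginHost {
    private byte[] currentPdf;      // original file until an edit happens
    private boolean edited = false;

    PluginHost(byte[] originalPdf) {
        this.currentPdf = originalPdf.clone();
    }

    void runAnalysis(AnalysisPlugin plugin) throws IOException {
        try (InputStream in = new ByteArrayInputStream(currentPdf)) {
            plugin.performAnalysis(in);
        }
    }

    void runEdit(EditingPlugin plugin) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(currentPdf)) {
            plugin.performPdfModification(in, out);
        }
        currentPdf = out.toByteArray();
        edited = true;
        // A real host would now reload the GUI from currentPdf, for example
        // with Loader.loadPDF(byte[]) in PDFBox 3.x.
    }

    boolean hasBeenEdited() {
        return edited;
    }
}

/** Toy analysis plugin: counts bytes instead of calling an external tool. */
final class ByteCountPlugin implements AnalysisPlugin {
    long lastCount = -1;

    public String getMenuLabel() {
        return "Count bytes";
    }

    public void performAnalysis(InputStream input) throws IOException {
        lastCount = input.readAllBytes().length;
    }
}

/** Toy editing plugin: copies the input and appends a comment line. */
final class AppendCommentPlugin implements EditingPlugin {
    public String getMenuLabel() {
        return "Append comment";
    }

    public void performPdfModification(InputStream input, OutputStream output)
            throws IOException {
        input.transferTo(output);
        output.write("\n% edited\n".getBytes(StandardCharsets.US_ASCII));
    }
}
```

With this split, the host can tell from the plugin type alone whether the 
document has to be reloaded afterwards, and an analysis plugin never gets a 
handle that would let it change what the Debugger displays.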

Adding things like context menu entries in the tree view, depending on the 
selected object, would certainly be nice in terms of functionality, but it 
would indeed introduce overhead for future development.


> Plugins for PDFDebugger
> -----------------------
>
>                 Key: PDFBOX-5584
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5584
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Moritz Flöter
>            Priority: Minor
>         Attachments: 2023-04-12_09-00-01_explorer_bmps4tqOTT.png
>
>
> The PDFBox Debugger is a great tool for analyzing PDF documents due to its 
> functionality and licence.
> However, it is constrained to what PDFBox itself can do. We extended the 
> Debugger to accomplish some of the more frequent tasks needed for processing 
> service tickets for our own software product.
> !2023-04-12_09-00-01_explorer_bmps4tqOTT.png!
> Some of the extended functionality relies on our proprietary PDF processing 
> (this is completely separate from PDFBox), but other features rely on 
> implementations built around PDFBox functionality (such as drawing or moving 
> PrintBoxes, removing document security attributes for subsequent analysis in 
> other tools, removing all text to get rid of sensitive data etc.).
> There is also functionality that relies on Java libraries like VeraPDF and 
> OpenPDF, or even on calls to external command-line tools like ghostscript and 
> pdfcpu (the latter with bundled binaries, the former without because of GPL).
> We would very much like to publish and contribute plugins for the Debugger, 
> but as of now everything is based on a direct extension (even using some 
> reflection) of the PDFDebugger class and thus cannot be made publicly 
> available as source (as it also relies on our proprietary PDF code). 
> Furthermore, dependencies on external software or third-party PDF libraries 
> really should not be integrated directly into the main PDFBox repositories, 
> and therefore I wouldn't know how to contribute back to the PDFBox project.
> However, I do not see any harm in having such dependencies in plugins that 
> are provided in separate repositories and possibly developed and maintained 
> by other developers. So I do see a benefit in having a plugin interface in 
> PDFDebugger.
> Most likely the group of people making and using such plugins is rather 
> small, but I still wanted to run the idea by you in case you are interested.
> I am also willing to work on this feature if I am provided with some input as 
> to what you expect.


