Hi Dorian,

 
I'd suggest starting with the RemoveAllText.java example to see the basic 
pattern for filtering items from the PDF token stream.

 
Adapting that example to remove the "Do" operator and its operand wherever 
the corresponding PDXObject is an instance of PDImageXObject should work.

 
This will remove raster images, but any line art on the page will remain.
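
As a rough illustration of that filtering step, here is a small library-free 
model in plain Java. It is not PDFBox code: tokens are modelled as plain 
strings, and the class name ImageDoFilter, the method removeImageDraws and the 
imageNames set are all hypothetical. In a real adaptation the tokens would come 
from the parsed content stream, and the membership test would be an 
"instanceof PDImageXObject" check on the object looked up in the page resources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the token filter; not part of PDFBox.
class ImageDoFilter {

    /**
     * Returns a copy of the token list without any "name Do" pair whose
     * name is listed in imageNames. All other tokens are kept in order.
     */
    static List<String> removeImageDraws(List<String> tokens, Set<String> imageNames) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if ("Do".equals(token) && !kept.isEmpty()) {
                // The operand of "Do" is the token written just before it.
                String operand = kept.get(kept.size() - 1);
                if (imageNames.contains(operand)) {
                    // Drop the operand and skip the operator itself.
                    kept.remove(kept.size() - 1);
                    continue;
                }
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample stream: one image (/Im0), some text, one form (/Fm1).
        List<String> tokens = Arrays.asList(
                "q", "/Im0", "Do", "Q",
                "BT", "(Hello)", "Tj", "ET",
                "/Fm1", "Do");
        Set<String> imageNames = new HashSet<>(Arrays.asList("/Im0"));

        // prints [q, Q, BT, (Hello), Tj, ET, /Fm1, Do]
        System.out.println(removeImageDraws(tokens, imageNames));
    }
}
```

Note that the "/Fm1 Do" pair survives because only names known to be images 
are filtered. The filtered token list would then be written back to the page 
in the same way RemoveAllText.java does.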

 
John

-----Original message-----
From: Dorian Messina
Sent: Thursday, January 10 2019, 5:41 am
To: users@pdfbox.apache.org
Subject: Question about a feature
 
Hello,
First, thank you for PDFBox and all the time you spend working on it to make 
our dev lives easier.

I am using the library for the first time and have one "how to" question.

I need to remove all pictures from a PDF with selectable text (I can select 
the text with the mouse).
Solutions exist on Stack Overflow 
https://stackoverflow.com/questions/6831194/how-can-i-remove-all-images-drawings-from-a-pdf-file-and-leave-text-only-in-java
and elsewhere, but the code is old and refers to methods that no longer 
exist. In particular, I cannot find this miraculous method:


resources.getImages().clear();

Does this feature still exist? Is there a simple way to achieve my goal?

Thank you

Happy new year

Dorian Messina
Analyst-Developer
Mobile : +32 493 02 63 57
d.mess...@wavenet.be

Wavenet
Rue de l'artisanat, 16
7900 Leuze-en-Hainaut | Belgique
Tel : +32 69 67 03 35
www.wavenet.be




