[
https://issues.apache.org/jira/browse/PDFBOX-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388218#comment-17388218
]
Tilman Hausherr commented on PDFBOX-5247:
-----------------------------------------
It's tricky, but you could use pdfbox-app and run WriteDecodedDoc, which
writes a copy of the PDF with its streams decoded. Then edit that copy with a
hex editor and replace the text with blanks. However, it's possible that the
text you are looking for isn't stored in readable form, so you won't be able
to find it, or that it is part of an image.
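
On the symptom in the issue title: the bytes c2 a0 are the UTF-8 encoding of
U+00A0 (no-break space), while 20 is an ordinary space. The following is only
a minimal sketch, not part of PDFBox, of how extracted text could be
normalized after the fact. The class and method names (NbspNormalizer,
normalize, utf8Hex) and the sample string are hypothetical and chosen for
illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a PDFBox class: post-processes already extracted text.
final class NbspNormalizer {
    private NbspNormalizer() {}

    /** Replaces every U+00A0 (no-break space) with U+0020 (ordinary space). */
    static String normalize(String extracted) {
        return extracted.replace('\u00A0', ' ');
    }

    /** Formats the UTF-8 bytes of s as lowercase hex pairs separated by spaces. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed stand-in for text returned by the extraction step.
        String sample = "ABN:\u00A052";
        System.out.println(utf8Hex(sample));            // 41 42 4e 3a c2 a0 35 32
        System.out.println(utf8Hex(normalize(sample))); // 41 42 4e 3a 20 35 32
    }
}
```

This only changes the extracted string. It does not alter the PDF or explain
why the file encodes the space that way.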
> Space in pdf returns c2 a0 characters instead of 20
> ---------------------------------------------------
>
> Key: PDFBOX-5247
> URL: https://issues.apache.org/jira/browse/PDFBOX-5247
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Environment: Portfolio Performance
> Version: 0.54.0 (Jul. 2021)
> Platform: win32, x86_64
> Java: 11.0.4+11-LTS, Azul Systems, Inc.
> Locale: AU
> Reporter: flywire
> Priority: Minor
>
> *pdf containing:*
> SelfWealth Limited ABN: 52 154 324 428 AFSL 421789 W: www.selfwealth.com.au
> E: [email protected]
> This trade was executed and cleared by OpenMarkets Australia Ltd ABN 38 090
> 472 012,
> AFSL 246 705, Market Particpant of ASX, CHIX and NSX.
> Buy Confirmation
>
> *Gives (see hex on right side):*
> !https://user-images.githubusercontent.com/11288701/126945391-18c0ccb4-289d-49cd-85a8-8714e145df3f.png!
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)