[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154684#comment-15154684
 ] 

Tilman Hausherr commented on PDFBOX-3030:
-----------------------------------------

For the 2.0 FAQ or the migration guide:

Why was the ReplaceText example removed?

Because it gave the false impression that text can be replaced easily. Words 
are often split into several fragments with positioning adjustments between 
them, as seen in this excerpt of a content stream:
{code}
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
{code}
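
As a rough illustration of the split, the sketch below searches the excerpt 
for the whole word and then joins the string fragments of the TJ array. The 
class TjFragments and its method joinText are hypothetical helpers, not part 
of PDFBox. The sketch assumes simple literal strings without escapes or nested 
parentheses, and it assumes the bytes in the stream happen to be readable 
characters, which is not guaranteed in a real PDF.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: shows why a plain string search
// on a content stream misses a word that is split inside a TJ array.
class TjFragments {
    // Matches simple literal strings such as (Do). Assumption: no escape
    // sequences and no nested parentheses, which real PDF strings may contain.
    private static final Pattern LITERAL = Pattern.compile("\\(([^()\\\\]*)\\)");

    // Concatenates the string fragments and drops the numeric adjustments.
    static String joinText(String tjOperand) {
        StringBuilder sb = new StringBuilder();
        Matcher m = LITERAL.matcher(tjOperand);
        while (m.find()) {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stream = "[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ";
        // The word never appears contiguously in the stream.
        System.out.println(stream.contains("Documentation"));
        // Only after joining the fragments is the word visible.
        System.out.println(joinText(stream));
    }
}
```

Even when the fragments can be joined like this, writing a replacement back 
would mean deciding what to do with the adjustments between the fragments.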

Other problems appear with font subsets. For example, if only the glyphs for 
a, b and c are used, these could be encoded as the codes 0, 1 and 2, so a 
search for "abc" in the content stream finds nothing. Additionally, you can't 
replace "c" with "d", because the glyph for "d" isn't part of the subset.
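
The subset problem can be modelled with a toy example. The class 
SubsetFontModel below is purely hypothetical and has nothing to do with how 
PDFBox represents fonts. It assumes codes are handed out in the order in which 
characters enter the subset, which is one possible scheme, not the only one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a subset font, not PDFBox API. Assumption: codes are assigned
// in the order in which the characters are added to the subset.
class SubsetFontModel {
    private final Map<Character, Integer> codes = new LinkedHashMap<>();

    SubsetFontModel(String usedCharacters) {
        for (char ch : usedCharacters.toCharArray()) {
            // The first new character gets code 0, the next one code 1, ...
            codes.putIfAbsent(ch, codes.size());
        }
    }

    // Returns the codes that would be written to the content stream.
    int[] encode(String text) {
        int[] out = new int[text.length()];
        for (int i = 0; i < text.length(); i++) {
            Integer code = codes.get(text.charAt(i));
            if (code == null) {
                throw new IllegalArgumentException(
                        "No glyph in subset for '" + text.charAt(i) + "'");
            }
            out[i] = code;
        }
        return out;
    }

    public static void main(String[] args) {
        SubsetFontModel font = new SubsetFontModel("abc");
        // "abc" is stored as the codes 0, 1, 2, not as the letters themselves.
        System.out.println(java.util.Arrays.toString(font.encode("abc")));
        try {
            font.encode("abd"); // 'd' was never embedded
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, replacing "c" with "d" fails because there is no code for "d"; 
a real replacement would need a font that actually contains the glyph.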

Ligatures cause further problems: "ff", "fl", "fi", "ffi" and "ffl" can each 
be represented by a single code in many fonts.
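
The effect of a ligature on searching can be sketched with plain JDK classes. 
The example assumes that the single code maps to a Unicode presentation form 
such as U+FB03 (the "ffi" ligature), which is not guaranteed for every font. 
The class LigatureSearch and its method expandLigatures are hypothetical names. 
java.text.Normalizer with form NFKC is used here because it decomposes these 
compatibility ligatures into their individual letters.

```java
import java.text.Normalizer;

// Hypothetical helper: shows how a single ligature code hides a word from a
// plain search. Assumption: the code maps to a Unicode presentation form.
class LigatureSearch {
    // NFKC normalization splits compatibility ligatures into separate letters.
    static String expandLigatures(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // "o", then the single ffi ligature character, then "ce"
        String text = "o\uFB03ce";
        // The ligature is one character, so the plain word is not found.
        System.out.println(text.contains("office"));
        // After normalization the individual letters are present.
        System.out.println(expandLigatures(text).contains("office"));
    }
}
```

This only helps when searching. Replacing text that contains a ligature still 
requires the individual glyphs to exist in the font.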

To see this for yourself, open any file in PDFDebugger and look at the 
"Contents" entry of a page.

> Enhance documentation for PDFBox 2.0.0
> --------------------------------------
>
>                 Key: PDFBOX-3030
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3030
>             Project: PDFBox
>          Issue Type: Task
>          Components: Documentation
>    Affects Versions: 2.0.0
>            Reporter: Maruan Sahyoun
>            Assignee: Maruan Sahyoun
>         Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
