You may be in luck!  I had a similar problem with missing characters that 
weren’t really missing and I just figured it out last night.  I found this 
answer on StackOverflow that describes in detail the challenge of text 
replacement in a pre-existing PDF.

http://stackoverflow.com/questions/15964704/java-pdfbox-reading-and-modifying-a-pdf-with-special-characters-diacritics

See the answer by Plinth.  Basically, what I found in mine was that any 
character that had not already been used in the PDF when it was rendered 
disappeared.  Reading his post made me realize that only a subset of the font 
was being embedded in the file.  To test this, I added a junk line containing 
all of the characters to my file during rendering, and it cleared up the 
problem.  The color and size of the line don’t seem to matter; what matters is 
whether the renderer decides the character is needed in the subset.  Hope this 
helps!
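
If it helps, here is a rough sketch of how such a junk line could be built: 
collect every distinct character from the values you plan to fill in later. 
The class and method names (SubsetWarmup, junkLine) are made up for 
illustration and are not part of PDFBox.  The code only produces the string; 
how you draw it into the PDF depends on whatever tool renders your file, so 
that part is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a PDFBox class): builds a "junk line" holding
// every distinct non-whitespace character of the values to be filled in.
final class SubsetWarmup {

    static String junkLine(List<String> plannedValues) {
        // LinkedHashSet drops duplicates and keeps first-seen order.
        Set<Integer> seen = new LinkedHashSet<>();
        for (String value : plannedValues) {
            value.codePoints().forEach(cp -> seen.add(cp));
        }

        StringBuilder line = new StringBuilder();
        for (int codePoint : seen) {
            // Whitespace is skipped here; it is not what goes missing.
            if (!Character.isWhitespace(codePoint)) {
                line.appendCodePoint(codePoint);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // The field value from the original question.
        String value = "Test it 123456789012345 äüö?ß! á Ф ф Й й άγγελος";
        System.out.println(junkLine(Arrays.asList(value)));
    }
}
```

Drawing the returned string once while the PDF is rendered should be enough 
to pull those glyphs into the subset, assuming the cause is the same as it 
was in my file.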

Thanks...Mike

From: Steffen R. [mailto:[email protected]]
Sent: Wednesday, June 12, 2013 4:18 AM
To: [email protected]
Subject: Problem with Unicode text in PDF form text field

Hello,
I am facing a problem that might be a bug. This is the scenario: Loading a PDF, 
filling in some form text fields and saving it back to PDF. When I do this

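import java.io.IOException;

import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDVariableText;

PDDocument doc = null;
try
{
    // Load the existing PDF that contains the form.
    doc = PDDocument.load( "Test.pdf" );

    // Look up the text field and set a value with non-ASCII characters.
    PDAcroForm form = doc.getDocumentCatalog().getAcroForm();
    PDVariableText field = (PDVariableText) form.getField("testField");
    field.setValue("Test it 123456789012345 äüö?ß! á Ф ф Й й άγγελος");

    // Write the filled form to a new file.
    doc.save( "TestFilled.pdf" );
}
catch (COSVisitorException e)
{
    e.printStackTrace();
}
catch (IOException e)
{
    e.printStackTrace();
}
finally
{
    // Always release the document, even if filling or saving failed.
    if( doc != null )
    {
        try
        {
            doc.close();
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}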
with the attached PDF file (created from scratch with Acrobat XI Standard), the 
field is filled in the saved PDF file, but the characters are not displayed as 
they appear in the code. And now the most curious thing: if you click into the 
form field, the correct text is shown. Very strange.
Is anyone else facing a similar problem? Is this a known bug? Does a workaround 
or patch exist?
I took a look at the source code. It seems that, besides the normal field 
value, an additional "appearance" for displaying the field value is added, 
which may not support Unicode the way it is currently implemented.

Thanks in advance for any help,
Steffen Harbich
