[ https://issues.apache.org/jira/browse/PDFBOX-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hesham updated PDFBOX-1552:
---------------------------
Attachment: pdf_with_uppercase_letters.pdf
This is a one-page sample file for testing.
> Uppercase letters are read in lowercase manner
> ----------------------------------------------
>
> Key: PDFBOX-1552
> URL: https://issues.apache.org/jira/browse/PDFBOX-1552
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.7.1
> Environment: Windows XP
> Reporter: Hesham
> Attachments: pdf_with_uppercase_letters.pdf
>
>
> I have a PDF in which some uppercase letters are read as lowercase when I
> extract its contents using PDFBox. For example:
> - Word "Testing" is read as "testing"
> - Word "Eve" is read as "eve"
> - Word "Deuteronomy" is read as "deuteronomy"
> Andreas commented on this: "The pdf uses marked content to replace a
> string (14.9.4 Replacement Text of the PDF specs provides a simple example).
> And yes, PDFBox doesn't support it, yet."
> Please check the attached one-page sample PDF.
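
The replacement-text behavior Andreas describes can be illustrated with a
minimal Java sketch. It is a toy model, not the PDFBox API: the class
ReplacementTextSketch, the Run type, and the sample data are all hypothetical.
It assumes the sample PDF shows the initial letter with a glyph that decodes to
lowercase, while the uppercase form is carried as replacement text
(/ActualText) on the enclosing marked-content span.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a page's text runs; NOT the PDFBox API.
class ReplacementTextSketch {

    /** One shown string, optionally wrapped in a span that carries /ActualText. */
    static final class Run {
        final String shownText;   // what the glyph codes decode to
        final String actualText;  // replacement text, or null if the run has none

        Run(String shownText, String actualText) {
            this.shownText = shownText;
            this.actualText = actualText;
        }
    }

    /** Concatenates the runs, optionally preferring replacement text over decoded glyphs. */
    static String extract(List<Run> runs, boolean honorActualText) {
        StringBuilder sb = new StringBuilder();
        for (Run run : runs) {
            if (honorActualText && run.actualText != null) {
                sb.append(run.actualText);
            } else {
                sb.append(run.shownText);
            }
        }
        return sb.toString();
    }

    /** Assumed layout of the word "Testing": first letter replaced via /ActualText. */
    static List<Run> sample() {
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("t", "T"));       // span with replacement text "T"
        runs.add(new Run("esting", null)); // ordinary text, no replacement
        return runs;
    }

    public static void main(String[] args) {
        List<Run> runs = sample();
        System.out.println(extract(runs, false)); // replacement text ignored
        System.out.println(extract(runs, true));  // replacement text honored
    }
}
```

Under these assumptions, an extractor that ignores the replacement text yields
"testing", which matches the reported symptom, and one that honors it yields
"Testing".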