[jira] [Commented] (PDFBOX-1572) PDFBox ExtracText problems with "ª"

Daniel Tizon (JIRA) Fri, 19 Apr 2013 03:41:18 -0700

    [ https://issues.apache.org/jira/browse/PDFBOX-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636247#comment-13636247 ]


Daniel Tizon commented on PDFBOX-1572:
--------------------------------------

Yep, you are right. I tried "Save as Text" with Acrobat Reader and got the 
same errors. So is there no OCR extension planned for PDFBox? ^^
                
> PDFBox ExtracText problems with "ª"
> -----------------------------------
>
>                 Key: PDFBOX-1572
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1572
>             Project: PDFBox
>          Issue Type: Improvement
>            Reporter: Daniel Tizon
>
> PDFBox has problems detecting ª in some PDFs.
> Examples: 
> I have in my PDF: 1ª
> When I extract text: P
> I have in my PDF: 2ª
> When I extract text: 22
> I have in my PDF: 3ª
> When I extract text: 32
> and there are many more examples related to "ª".
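
To make the mismatch in the examples above easier to see, here is a minimal sketch in plain Java that compares each expected string with what was extracted and prints the code points side by side. It does not call PDFBox; the class name OrdinalCheck and the helpers ordinalLost and codePoints are hypothetical names used only for illustration, and the sample pairs are the three from the report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of PDFBox.
class OrdinalCheck {
    // U+00AA FEMININE ORDINAL INDICATOR, the character the report says is lost.
    static final char ORDINAL = '\u00AA';

    // True when the expected text contains "ª" but the extracted text does not.
    static boolean ordinalLost(String expected, String extracted) {
        return expected.indexOf(ORDINAL) >= 0 && extracted.indexOf(ORDINAL) < 0;
    }

    // Formats every code point as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Expected text in the PDF -> text reported after extraction.
        Map<String, String> samples = new LinkedHashMap<>();
        samples.put("1\u00AA", "P");
        samples.put("2\u00AA", "22");
        samples.put("3\u00AA", "32");

        for (Map.Entry<String, String> e : samples.entrySet()) {
            System.out.println("expected [" + codePoints(e.getKey())
                    + "] extracted [" + codePoints(e.getValue())
                    + "] ordinal lost: " + ordinalLost(e.getKey(), e.getValue()));
        }
    }
}
```

Running the same comparison on text taken from another viewer's "Save as Text" output would show whether the wrong code points are specific to one tool or come out the same everywhere.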

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
