[ https://issues.apache.org/jira/browse/TIKA-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362593#comment-14362593 ]
Tyler Palsulich commented on TIKA-1199:
---------------------------------------

Is this the same issue as TIKA-1095? You can open the PDF, but if you try to copy+paste the text, it comes out as gibberish.

> Tika extracts weird signs instead of text
> -----------------------------------------
>
>                 Key: TIKA-1199
>                 URL: https://issues.apache.org/jira/browse/TIKA-1199
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4
>         Environment: MacOSX, Linux
>            Reporter: Marc Teutelink
>         Attachments: gaat fout.pdf, plain_text_tika_output_from_gaat_fout_pdf.txt, structured_text_tika_output_from_gaat_fout_pdf.xml
>
>
> Tika extracts complete bogus text from the attached document. I have attached the .PDF in question and also added the plain and structured text output from Tika.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)